clowder-framework / extractors-s2orc-pdf2text

Extractor to convert pdf to text
Apache License 2.0

Output csv file as a result #20

Closed minump closed 5 months ago

minump commented 8 months ago

Produce a CSV file from the JSON file, with one row per sentence and the following fields: file (str), section (list), sentence (str), prev_sentence (str), next_sentence (str), tokenized_sentence (list), coordinates (str).

This CSV can then be used as the model input.
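
A minimal sketch of the conversion, in case it helps scope the work. It assumes the JSON has already been flattened into a list of per-sentence dicts keyed by the field names above; that layout, the helper name records_to_csv, and the choice to store the two list fields as JSON strings inside their CSV cells are all assumptions, not what the extractor currently does.

```python
import csv
import io
import json

# Column order taken from the field list in this issue.
FIELDS = ["file", "section", "sentence", "prev_sentence",
          "next_sentence", "tokenized_sentence", "coordinates"]
# Fields typed as list in the issue; serialized as JSON so they survive a CSV round trip.
LIST_FIELDS = {"section", "tokenized_sentence"}


def records_to_csv(records, out):
    """Write per-sentence dicts to `out` (any text file object) as CSV.

    Hypothetical helper: `records` is assumed to be a list of dicts
    that use the field names above. Missing fields become "" or [].
    """
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    for rec in records:
        row = {}
        for name in FIELDS:
            if name in LIST_FIELDS:
                row[name] = json.dumps(rec.get(name, []))
            else:
                row[name] = rec.get(name, "")
        writer.writerow(row)


# Illustrative record only; real values would come from the extractor's JSON.
example = [{
    "file": "paper.pdf",
    "section": ["Introduction"],
    "sentence": "Hello world.",
    "prev_sentence": "",
    "next_sentence": "Second sentence.",
    "tokenized_sentence": ["Hello", "world", "."],
    "coordinates": "1,10,20,30,40",
}]

buf = io.StringIO()
records_to_csv(example, buf)
csv_text = buf.getvalue()
```

To write to disk instead, open the target with open(path, "w", newline="", encoding="utf-8") and pass that file object as out. A consumer can recover the list columns with json.loads on the section and tokenized_sentence cells.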