Closed: minump closed this issue 1 year ago
Convert the JSON output from the s2orc-pdf2text extractor to a .txt file. Extract only the "text" fields from the JSON output file, concatenate them, and write the result to a .txt file. Upload the .txt file to the same dataset in Clowder.
This will be built on top of https://github.com/clowder-framework/extractors-s2orc-pdf2text/pull/7
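A minimal sketch of the conversion step described above, not the code from the linked PR. It assumes that every string stored under a "text" key should be kept, wherever it is nested, and that the pieces are joined with newlines. The exact schema of the extractor's JSON output is not given in this issue, so the recursive walk is an assumption. The names `collect_text` and `json_to_txt` are hypothetical.

```python
import json
from pathlib import Path


def collect_text(node):
    """Recursively gather every string stored under a "text" key, in document order.

    Assumption: the JSON schema is not spelled out in the issue, so this walks
    the whole structure instead of relying on specific field names.
    """
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "text" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_text(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_text(item))
    return found


def json_to_txt(json_path, txt_path=None, separator="\n"):
    """Write the concatenated "text" values of a JSON file to a .txt file.

    If txt_path is omitted, the output sits next to the input with a .txt suffix.
    Returns the path of the written file.
    """
    json_path = Path(json_path)
    txt_path = Path(txt_path) if txt_path else json_path.with_suffix(".txt")
    with json_path.open(encoding="utf-8") as fh:
        data = json.load(fh)
    txt_path.write_text(separator.join(collect_text(data)), encoding="utf-8")
    return txt_path
```

Calling `json_to_txt("paper.json")` would write `paper.txt` in the same directory. The upload to the Clowder dataset is left out here because it depends on the Clowder client and version in use.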
PR merged. Closing this issue.