Use-case: ability to retrieve plain text content for individual announced e-prints

arXiv / zzzArchived_arxiv-fulltext

arXiv plain text extraction

MIT License


As a developer, I want to be able to retrieve plain text content for individual e-prints, so that I can build cool tools and apps that use text mining, classification, etc.

Right now the plain text service focuses on extracting text from PDFs held by the compilation service. We already have a service module for getting the announced PDF (https://github.com/arXiv/arxiv-fulltext/blob/703e8644cf82c09fe99960b6775b0c677f7d1bc5/fulltext/services/pdf.py), and some of the routes already support arXiv e-print IDs. We should test those routes further to confirm that they work as expected for announced e-prints.
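As a starting point for that testing, here is a minimal sketch of how a client might validate an arXiv e-print ID and build a request URL for it. The helper names (`is_arxiv_id`, `fulltext_url`), the base URL, and the `<base>/<identifier>` path layout are assumptions for illustration only; they are not taken from the service's actual routes.

```python
import re
from urllib.parse import quote

# New-style IDs: YYMM.NNNN or YYMM.NNNNN, with an optional version suffix.
NEW_STYLE = re.compile(r"\d{4}\.\d{4,5}(v\d+)?")
# Old-style IDs: archive[.SC]/YYMMNNN, with an optional version suffix.
OLD_STYLE = re.compile(r"[a-z-]+(\.[A-Z]{2})?/\d{7}(v\d+)?")


def is_arxiv_id(identifier: str) -> bool:
    """Return True if the string looks like an arXiv e-print identifier."""
    return bool(NEW_STYLE.fullmatch(identifier) or OLD_STYLE.fullmatch(identifier))


def fulltext_url(base_url: str, identifier: str) -> str:
    """Build a plain text request URL (hypothetical path layout)."""
    if not is_arxiv_id(identifier):
        raise ValueError(f"not an arXiv e-print ID: {identifier!r}")
    # Keep the slash in old-style IDs such as hep-th/9901001.
    return f"{base_url.rstrip('/')}/{quote(identifier, safe='/')}"
```

Checking the ID on the client side keeps malformed requests away from the service, and covering both the old-style and new-style forms matters because announced e-prints use both.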

Use-case: ability to retrieve plain text content for individual announced e-prints #20