HazyResearch / fonduer

A knowledge base construction engine for richly formatted data
https://fonduer.readthedocs.io/
MIT License

How can I extract a paragraph and all associated sentences in a document? #522

Open ashleo25 opened 4 years ago

ashleo25 commented 4 years ago

How can I extract a paragraph and all of its associated sentences from a document?
Basically, I need the paragraphs together with their associated sentences. @lukehsiao @SenWu @vincentschen @ZZWENG @stephenbach

Appreciate your help

lukehsiao commented 4 years ago

It's not clear to me exactly what you're trying to accomplish, or whether it makes sense to use Fonduer for it. Fonduer helps extract pre-determined relationships from richly formatted documents. If you just want the document contents, you might be able to use something simpler, like Poppler's pdftotext, instead. Can you explain your use case in detail?
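
If plain document contents are enough, the pdftotext route could look roughly like the sketch below. It assumes the input is a PDF, that Poppler's `pdftotext` is on the PATH, and that paragraphs come out separated by blank lines or page breaks, which depends on the document. The helper names (`pdf_to_text`, `split_paragraphs`, `split_sentences`, `paragraphs_with_sentences`) are made up for illustration and are not part of Fonduer or Poppler, and the sentence splitter is a naive regex that will trip over abbreviations.

```python
import re
import subprocess


def pdf_to_text(pdf_path):
    """Run Poppler's pdftotext; "-" as the output file sends the text to stdout."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def split_paragraphs(text):
    """Hypothetical helper: blank lines and form feeds (page breaks) end a paragraph."""
    blocks = re.split(r"(?:\n\s*\n|\f)+", text)
    # Collapse line wraps inside a paragraph into single spaces.
    return [" ".join(block.split()) for block in blocks if block.strip()]


def split_sentences(paragraph):
    """Hypothetical helper: naive split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s]


def paragraphs_with_sentences(text):
    """Return one list of sentences per paragraph, in document order."""
    return [split_sentences(p) for p in split_paragraphs(text)]
```

Something like `paragraphs_with_sentences(pdf_to_text("report.pdf"))` (the file name is a placeholder) would then give a list of paragraphs, each as a list of sentences. For anything beyond a quick look, a proper sentence tokenizer would be a better choice than the regex.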