ad-freiburg / pdfact

A basic tool that extracts the structure from the PDF files of scientific articles.
Apache License 2.0
74 stars 11 forks

Cannot extract keywords and abstract from many PDF articles #2

Open tmbahadar opened 6 years ago

tmbahadar commented 6 years ago

Which dataset are you using for experimentation?

fabiojavamarcos commented 5 years ago

PDFAct has trouble finding the bounding boxes in some PDF files that use a two-column format, for example ICEIS_2015_167-xml-out.pdf.