allenai / pdffigures2

Given a scholarly PDF, extract figures, tables, captions, and section titles.
http://pdffigures2.allenai.org/
Apache License 2.0

duplicate figure name in saveRasterizedFigures #52

Open liujxing opened 2 years ago

liujxing commented 2 years ago

In FigureExtractorBatchCli.scala, the filenames variable in the function saveRasterizedFigures can contain duplicate values, as several of my tests on PDF files showed. When this happens, an image generated later overwrites an earlier image that shares the same filename, so deduplication is needed. (I cannot submit a pull request because I'm not familiar with Scala.)
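As a rough illustration of the kind of deduplication described above, here is a minimal sketch in Scala. It is not the project's code: `FilenameDedup` and `dedupeFilenames` are hypothetical names, the `-dupN` suffix is an arbitrary choice, and the sample filenames are made up. It assumes only that the filenames are available as a `Seq[String]` before any image is written.

```scala
import scala.collection.mutable

// Hypothetical helper, not part of pdffigures2.
object FilenameDedup {
  /** Returns the names in their original order, with repeated names
    * renamed so that every entry in the result is distinct. */
  def dedupeFilenames(filenames: Seq[String]): Seq[String] = {
    val used = mutable.Set.empty[String]
    filenames.map { name =>
      // Split "stem.ext" at the last dot so the suffix goes before the extension.
      val dot = name.lastIndexOf('.')
      val (stem, ext) = if (dot > 0) name.splitAt(dot) else (name, "")
      var candidate = name
      var n = 1
      // Set.add returns false when the name is already taken; keep trying suffixes.
      while (!used.add(candidate)) {
        candidate = s"$stem-dup$n$ext"
        n += 1
      }
      candidate
    }
  }
}
```

For example, `FilenameDedup.dedupeFilenames(Seq("paper-Figure1-1.png", "paper-Figure1-1.png"))` would leave the first name unchanged and rename the second to `paper-Figure1-1-dup1.png`, so the second image no longer overwrites the first. Whether a suffix like this suits the project's naming scheme is for the maintainers to decide.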

MuiseDestiny commented 1 year ago


Thank you to the developers for the excellent work. I am trying to integrate this project into a Zotero plugin, and I have run into the same problem described in this issue.