allenai / pdffigures2

Given a scholarly PDF, extract figures, tables, captions, and section titles.
http://pdffigures2.allenai.org/
Apache License 2.0

Batch cli stops after failure in extracting one pdf #26

Open jnzs1836 opened 5 years ago

jnzs1836 commented 5 years ago

When I use the batch CLI, the program stops as soon as it fails to extract one of the PDFs. Is there any way to ignore the failure and skip to the next PDF?

I tried running the single-PDF command in a loop instead, but that command does not save the figures to disk.

Sparkier commented 4 years ago

What was the error? I ran into a similar issue caused by a missing font, and patched the code so that it continues after the IllegalArgumentException, basically ignoring that one document.

under-score commented 2 years ago

Same issue here. @Sparkier, could you elaborate? Thank you.

Sparkier commented 2 years ago

This was a while ago and I no longer remember the details, nor can I reproduce the problem. If you have an example and can tell me where it fails, I might be able to help you figure it out.

under-score commented 2 years ago

Many thanks. I just found an undocumented -e switch that ignores errors.
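
For anyone who ends up running an extractor once per PDF instead, as tried at the top of this thread, a minimal Python sketch of a loop that skips failures might look like the following. It is not part of pdffigures2: `run_each` and `command_prefix` are hypothetical names, and the actual command line (including whatever options make it write figures to disk) is an assumption left for the caller to supply.

```python
import subprocess
import sys
from pathlib import Path


def run_each(pdf_dir, command_prefix):
    """Run command_prefix + [pdf_path] once per PDF in pdf_dir.

    command_prefix is a hypothetical placeholder: a list holding the
    extractor command and its options, supplied by the caller.
    Returns two lists of file names: (succeeded, failed).
    """
    succeeded, failed = [], []
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        # subprocess.run does not raise on a nonzero exit code unless
        # check=True is passed, so one bad PDF cannot stop the loop.
        result = subprocess.run(
            [*command_prefix, str(pdf)],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            succeeded.append(pdf.name)
        else:
            failed.append(pdf.name)
            print(f"skipped {pdf.name}: exit code {result.returncode}",
                  file=sys.stderr)
    return succeeded, failed
```

Each PDF runs in its own process, so a crash on one document is only recorded in the `failed` list and the remaining files are still processed. The trade-off is that the extractor is started once per file, which is slower than a single batch run.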