tabulapdf / tabula-java

Extract tables from PDF files
MIT License

Error in batch mode (out of bounds) could give more info #329

Open rusosnith opened 5 years ago

rusosnith commented 5 years ago

I'm parsing 300+ two-page PDFs, but one of them seems to have only one page, so when told to scrape page 2 the tabula command line gives the following error:

    Exception in thread "main" java.lang.IndexOutOfBoundsException: Page number does not exist
    at technology.tabula.ObjectExtractor.extractPage(ObjectExtractor.java:19)
    at technology.tabula.PageIterator.next(PageIterator.java:29)
    at technology.tabula.CommandLineApp.extractFile(CommandLineApp.java:165)
    at technology.tabula.CommandLineApp.extractFileInto(CommandLineApp.java:143)
    at technology.tabula.CommandLineApp.extractDirectoryTables(CommandLineApp.java:121)
    at technology.tabula.CommandLineApp.extractTables(CommandLineApp.java:97)
    at technology.tabula.CommandLineApp.main(CommandLineApp.java:79)

It would be awesome if the error message named the file that caused it!
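To illustrate the kind of reporting I mean, here is a minimal sketch of a batch loop that records which file failed instead of aborting with a bare exception. This is not tabula's actual code: `BatchWithContext`, `FileExtractor`, and `extractAll` are hypothetical names made up for the example, and the extractor is just a stand-in for whatever processes a single PDF.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

final class BatchWithContext {

    /** Hypothetical stand-in for whatever extracts tables from one PDF. */
    interface FileExtractor {
        void extract(File pdf) throws Exception;
    }

    /**
     * Runs the extractor on every file and keeps going on failure.
     * Returns one "path: message" entry per file that threw.
     */
    static List<String> extractAll(List<File> pdfs, FileExtractor extractor) {
        List<String> failures = new ArrayList<>();
        for (File pdf : pdfs) {
            try {
                extractor.extract(pdf);
            } catch (Exception e) {
                // Attach the file path so the user can see which input was bad.
                failures.add(pdf.getPath() + ": " + e.getMessage());
            }
        }
        return failures;
    }
}
```

With something like this, a one-page PDF in the batch would show up as an entry with its path and "Page number does not exist", and the other files would still be processed.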

jazzido commented 5 years ago

Thanks @rusosnith.

Can you share the exact command line arguments that caused this error?