schierlm / BibleMultiConverter

Converter written in Java to convert between different Bible program formats

NumberFormatException to give more context #22

Closed teusbenschop closed 5 years ago

teusbenschop commented 5 years ago

Running $ ./run in the folder downloaded from http://bibleconsultants.nl/downloads/biblemulticonverter/NumberFormatException/ produces this exception:

Exception in thread "main" java.lang.NumberFormatException: For input string: ""
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Integer.parseInt(Integer.java:592)
    at java.lang.Integer.parseInt(Integer.java:615)
    at biblemulticonverter.format.paratext.USFM.doImportBook(USFM.java:180)
    at biblemulticonverter.format.paratext.USFM.doImportBook(USFM.java:62)
    at biblemulticonverter.format.paratext.AbstractParatextFormat.doImportAllBooks(AbstractParatextFormat.java:264)
    at biblemulticonverter.format.paratext.AbstractParatextFormat.doImportBooks(AbstractParatextFormat.java:211)
    at biblemulticonverter.format.paratext.AbstractParatextFormat.doImport(AbstractParatextFormat.java:55)
    at biblemulticonverter.Main.main(Main.java:66)

Could the exception provide more context? Yes, the cause is malformed USFM, and that is the underlying problem. But an exception with more context would make it easier for whoever edits the USFM to find the malformed part.

schierlm commented 5 years ago

In the next version, the error will look like this:

Exception in thread "main" java.lang.RuntimeException: Failed parsing 21_Psalms.usfm
    at biblemulticonverter.format.paratext.AbstractParatextFormat.doImportAllBooks(AbstractParatextFormat.java:269)
    at biblemulticonverter.format.paratext.AbstractParatextFormat.doImportBooks(AbstractParatextFormat.java:211)
    at biblemulticonverter.format.paratext.AbstractParatextFormat.doImport(AbstractParatextFormat.java:55)
    at biblemulticonverter.Main.main(Main.java:66)
Caused by: java.lang.NumberFormatException: Invalid chapter number in \c 
    at biblemulticonverter.format.paratext.USFM.doImportBook(USFM.java:181)
    at biblemulticonverter.format.paratext.USFM.doImportBook(USFM.java:62)
    at biblemulticonverter.format.paratext.AbstractParatextFormat.doImportAllBooks(AbstractParatextFormat.java:265)
    ... 3 more
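A minimal sketch of how a two-level error like the one above can be produced: the low-level parse failure is rethrown with a message naming the marker, and the caller wraps it again with the file name. The class name, method names, and the map of file names to \c arguments are hypothetical and only for illustration; this is not the project's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of exception wrapping; not BibleMultiConverter's real parser. */
class ParseContextSketch {

    /** Parses the argument of a \c marker; on failure, names the marker and the raw value. */
    static int parseChapterNumber(String rawArgument) {
        try {
            return Integer.parseInt(rawArgument.trim());
        } catch (NumberFormatException ex) {
            NumberFormatException detailed =
                    new NumberFormatException("Invalid chapter number in \\c " + rawArgument);
            detailed.initCause(ex); // keep the original failure in the cause chain
            throw detailed;
        }
    }

    /** Parses one \c argument per (hypothetical) file and wraps any failure with the file name. */
    static void importAll(Map<String, String> chapterArgumentByFile) {
        for (Map.Entry<String, String> entry : chapterArgumentByFile.entrySet()) {
            try {
                parseChapterNumber(entry.getValue());
            } catch (RuntimeException ex) {
                throw new RuntimeException("Failed parsing " + entry.getKey(), ex);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("20_Job.usfm", "1");   // assumed well-formed input
        files.put("21_Psalms.usfm", ""); // empty chapter number, as in the report
        try {
            importAll(files);
        } catch (RuntimeException ex) {
            System.out.println(ex.getMessage());
            System.out.println("Caused by: " + ex.getCause().getMessage());
        }
    }
}
```

The outer message identifies the file, and the cause identifies the marker that failed, which is the same split the new stack trace shows.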

I understand that a better validation error (e.g. one with line numbers) would be preferable, but the parser currently starts by normalizing whitespace, so all line information is already lost by the time the parse error is encountered.

This problem is more or less unique to the USFM import format: the other formats are mostly based on XML or another file format for which line-number-preserving parsers and validators are available, so the parser writer does not have to track line numbers. I don't intend to invest that much more time in the parser just for one "exotic" input format.
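For illustration only, here is a rough sketch of what line tracking during whitespace normalization could look like: each whitespace-separated token remembers the line it started on, so a bad \c can be reported with a line number. All names here are hypothetical, and the approach is an assumption about one possible design, not how the existing parser works.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: whitespace-splitting tokenizer that remembers line numbers. */
class LineTrackingSketch {

    /** A token plus the 1-based line on which it starts. */
    static final class Token {
        final String text;
        final int line;

        Token(String text, int line) {
            this.text = text;
            this.line = line;
        }
    }

    /** Splits the input on whitespace, recording the starting line of each token. */
    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int line = 1;      // line of the character being read
        int tokenLine = 1; // line on which the current token began
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (Character.isWhitespace(ch)) {
                if (current.length() > 0) {
                    tokens.add(new Token(current.toString(), tokenLine));
                    current.setLength(0);
                }
                if (ch == '\n') {
                    line++;
                }
            } else {
                if (current.length() == 0) {
                    tokenLine = line;
                }
                current.append(ch);
            }
        }
        if (current.length() > 0) {
            tokens.add(new Token(current.toString(), tokenLine));
        }
        return tokens;
    }

    /** Returns the line of the first \c marker not followed by an integer, or -1 if none. */
    static int findInvalidChapterLine(String input) {
        List<Token> tokens = tokenize(input);
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).text.equals("\\c")) {
                continue;
            }
            String argument = i + 1 < tokens.size() ? tokens.get(i + 1).text : "";
            try {
                Integer.parseInt(argument);
            } catch (NumberFormatException ex) {
                return tokens.get(i).line;
            }
        }
        return -1;
    }
}
```

The cost of this design is that every later parsing stage has to carry the token positions along, which is the extra work described above.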

I opened #26 to track this further.

teusbenschop commented 5 years ago

Thank you for looking at this! It will help developers, who will now see more details about what caused the exception.