teusbenschop closed this 5 years ago
In the next version, the error will look like this:
Exception in thread "main" java.lang.RuntimeException: Failed parsing 21_Psalms.usfm
at biblemulticonverter.format.paratext.AbstractParatextFormat.doImportAllBooks(AbstractParatextFormat.java:269)
at biblemulticonverter.format.paratext.AbstractParatextFormat.doImportBooks(AbstractParatextFormat.java:211)
at biblemulticonverter.format.paratext.AbstractParatextFormat.doImport(AbstractParatextFormat.java:55)
at biblemulticonverter.Main.main(Main.java:66)
Caused by: java.lang.NumberFormatException: Invalid chapter number in \c
at biblemulticonverter.format.paratext.USFM.doImportBook(USFM.java:181)
at biblemulticonverter.format.paratext.USFM.doImportBook(USFM.java:62)
at biblemulticonverter.format.paratext.AbstractParatextFormat.doImportAllBooks(AbstractParatextFormat.java:265)
... 3 more
I can understand that a better validation error (e.g. with line numbers) would be preferred, but the parser currently starts by normalizing whitespace, so all line information is long lost by the time the parse error is encountered.
This problem is more or less unique to the USFM import format. Other formats are mostly based on XML or another file format for which line-number-preserving parsers and validators are available, so the parser writer does not have to track line numbers. And I don't intend to invest that much more time into the parser just for one "exotic" input format.
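For illustration only, here is a rough sketch of how a line number could be recovered by scanning the raw text before any whitespace normalization. This is not BibleMultiConverter code: the class `ChapterLineCheck` and the method `findInvalidChapterLine` are hypothetical names, and the sketch assumes every `\c` marker starts its own line and is followed by a token, which real USFM files do not guarantee.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical standalone helper; not part of BibleMultiConverter.
public class ChapterLineCheck {

    // Matches a \c marker at the start of a (trimmed) line and captures the token after it.
    // Assumption: chapter markers begin a line and a token follows; a bare "\c" is not detected.
    private static final Pattern CHAPTER = Pattern.compile("^\\\\c\\s+(\\S*)");

    /**
     * Returns the 1-based line number of the first \c marker whose
     * argument is not a plain decimal number, or -1 if none is found.
     */
    public static int findInvalidChapterLine(String usfm) {
        // Split on any common line ending; keep trailing empty lines so numbering stays exact.
        String[] lines = usfm.split("\r\n|\r|\n", -1);
        for (int i = 0; i < lines.length; i++) {
            Matcher m = CHAPTER.matcher(lines[i].trim());
            if (m.find() && !m.group(1).matches("[0-9]+")) {
                return i + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String sample = "\\id PSA\n\\c 1\n\\v 1 Blessed is the man\n\\c 2a\n\\v 1 Why do the nations rage";
        // Prints: First invalid \c marker on line 4
        System.out.println("First invalid \\c marker on line " + findInvalidChapterLine(sample));
    }
}
```

A pre-scan like this leaves the existing parser untouched and only runs over the raw file contents, at the cost of duplicating a small part of the marker recognition.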
I opened #26 to track this further.
Thank you for looking at this! It is going to help the developers, who will now see more details about what caused the exception.
When doing
$ ./run
in the folder downloaded from http://bibleconsultants.nl/downloads/biblemulticonverter/NumberFormatException/ it gives this exception:
Would it be possible for the exception to provide more context? Yes, the cause is malformed USFM; that is the original problem. But if the exception gave more context, it would be easier for the USFM editor to find the malformed bit in the USFM.