Open bertsky opened 3 years ago
Writing code for PAGE-XML that builds on prima-core-libs is difficult without informative error messages. As a user, I frequently get null even for files that validate perfectly under libxml2 (xmllint, xmlstarlet, Python lxml, etc.), which is not as strict as the parser used here.
Any news here @chris1010010?
@bertsky, please add an example document which does not open in PageViewer and which can be used to test your pull request.
I sometimes have trouble debugging PAGE-XML documents that just won't open in PageViewer, even though they validate against the schema and contain no obvious mistake. The problem is that PageViewer won't tell you what went wrong (except when it crashes outright, in which case you at least get a stack trace).
I dug into `/PrimaDla/src/org/primaresearch/dla/page/io/xml/XmlPageReader.java` and found that `XmlPageReader.read()` does have all the information in a `PageErrorHandler` instance called `lastErrors`. But this gets thrown away. Why is this not piggy-backed on an exception which PageViewer's event listener can then react to?
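To make the idea concrete, here is a minimal sketch of what I mean by piggy-backing the collected errors on an exception. It uses only the JDK's own XML parser, not the actual `XmlPageReader` code, and the names `ReadErrorSketch`, `CollectingErrorHandler` and `PageReadException` are made up for illustration (the real `PageErrorHandler` may look quite different):

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class ReadErrorSketch {

    /** Hypothetical stand-in for PageErrorHandler: records every problem it is told about. */
    public static class CollectingErrorHandler implements ErrorHandler {
        public final List<String> messages = new ArrayList<>();
        public boolean failed = false;

        private void record(String level, SAXParseException e) {
            messages.add(level + " at line " + e.getLineNumber() + ": " + e.getMessage());
        }

        @Override public void warning(SAXParseException e) { record("warning", e); }

        @Override public void error(SAXParseException e) { failed = true; record("error", e); }

        @Override public void fatalError(SAXParseException e) throws SAXParseException {
            failed = true;
            record("fatal", e);
            throw e; // parsing cannot continue after a fatal error
        }
    }

    /** Hypothetical exception that carries the collected messages to the caller. */
    public static class PageReadException extends Exception {
        public final List<String> details;

        public PageReadException(List<String> details, Throwable cause) {
            super("Could not read document: " + String.join("; ", details), cause);
            this.details = details;
        }
    }

    /** Parses the XML and throws, instead of returning null, when errors were recorded. */
    public static Document read(String xml) throws PageReadException {
        CollectingErrorHandler handler = new CollectingErrorHandler();
        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(handler);
            doc = builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            if (handler.messages.isEmpty()) {
                handler.messages.add(e.toString());
            }
            throw new PageReadException(handler.messages, e);
        }
        if (handler.failed) {
            throw new PageReadException(handler.messages, null);
        }
        return doc;
    }
}
```

A caller such as a GUI event listener could then catch the exception and print or display `details`, rather than only seeing a null result.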
For example, it would help to see (at least on the console):