PRImA-Research-Lab / prima-page-converter

Command-line tool to convert page layout files to the latest PAGE XML format. It supports all previous versions of the PAGE format as well as ALTO XML, FineReader XML, and hOCR.
Apache License 2.0

NullPointerException when trying to convert ALTO to PAGE #11

wrznr closed this issue 4 years ago

wrznr commented 4 years ago

I am trying to convert ALTO XML (2.0) to PAGE XML using version 1.5 as follows:

$ java -jar ~/Software/JPageConverter\ 1.5/PageConverter.jar -source-xml FULLTEXT/fileocr00013.xml -target-xml 0013.xml
Could not save target PAGE XML file: 0013.xml
java.lang.NullPointerException
    at org.primaresearch.dla.page.io.xml.PageXmlInputOutput.writePage(PageXmlInputOutput.java:165)
    at org.primaresearch.dla.page.converter.PageConverter.run(PageConverter.java:188)
    at org.primaresearch.dla.page.converter.PageConverter.main(PageConverter.java:103)

ALTO file for testing: sample_alto.zip.

Any hints would be highly appreciated!

chris1010010 commented 4 years ago

Hi, the ALTO file is invalid. These are the error messages I get with XML Notepad: [image]
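For anyone hitting the same exception, one way to rule out an invalid input before running the converter is to validate the file against its XSD. The following is a minimal sketch using only the JDK's standard `javax.xml.validation` API; it is not part of PageConverter. The class name `XmlSchemaCheck` and the command-line arguments are hypothetical, and it assumes you have a local copy of the ALTO schema that matches your file's version.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class, not part of PageConverter.
class XmlSchemaCheck {

    // Validates xml against xsd and returns every reported problem.
    // An empty list means the document is schema-valid.
    static List<String> validate(Source xml, Source xsd) throws SAXException, IOException {
        List<String> problems = new ArrayList<>();
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Validator validator = factory.newSchema(xsd).newValidator();

        // Collect problems instead of stopping at the first recoverable one.
        validator.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) {
                problems.add(describe("warning", e));
            }
            @Override public void error(SAXParseException e) {
                problems.add(describe("error", e));
            }
            @Override public void fatalError(SAXParseException e) throws SAXException {
                problems.add(describe("fatal", e));
                throw e; // parsing cannot continue, e.g. malformed XML
            }
        });

        try {
            validator.validate(xml);
        } catch (SAXException e) {
            // Fatal errors are normally recorded by the handler above.
            if (problems.isEmpty()) {
                problems.add("fatal: " + e.getMessage());
            }
        }
        return problems;
    }

    private static String describe(String level, SAXParseException e) {
        return level + " at line " + e.getLineNumber() + ": " + e.getMessage();
    }

    // Usage (paths are placeholders): java XmlSchemaCheck <file.xml> <schema.xsd>
    public static void main(String[] args) throws Exception {
        List<String> problems = validate(
                new StreamSource(new File(args[0])),
                new StreamSource(new File(args[1])));
        if (problems.isEmpty()) {
            System.out.println("Document is valid.");
        } else {
            problems.forEach(System.out::println);
        }
    }
}
```

If the list is non-empty, the input probably needs fixing before conversion is attempted; the messages should be similar in spirit to what a validating XML editor reports.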

wrznr commented 4 years ago

Good catch! Many thanks @chris1010010