Closed GoogleCodeExporter closed 9 years ago
The approach is to investigate and test the problem to find the affected area
inside the OAI Toolkit.
The next step would be to catch the error and prevent it from crashing the
toolkit, so that the OAI Toolkit can continue processing the other XML files.
The files that might need to be modified are Importer.java and XMLUtil.java.
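The catch-and-continue idea can be sketched as follows. This is only an illustration, not the toolkit's actual code: the class name SkipBadFiles and the method processAll are hypothetical, and a plain JAXP SAX parser stands in for the marc4j reader that Importer.java uses. A parse failure in one file is logged and the loop moves on to the next file.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the OAI Toolkit.
public class SkipBadFiles {

    /** Parses each file in turn and returns the files that could not be parsed. */
    public static List<File> processAll(List<File> files) {
        List<File> failed = new ArrayList<>();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        for (File file : files) {
            try {
                // DefaultHandler rethrows fatal errors, so malformed XML
                // surfaces here as a SAXException instead of ending the run.
                factory.newSAXParser().parse(file, new DefaultHandler());
            } catch (SAXException | IOException | ParserConfigurationException e) {
                System.err.println("Skipping " + file.getName() + ": " + e.getMessage());
                failed.add(file);
            }
        }
        return failed;
    }
}
```

In the toolkit itself the equivalent catch would presumably need to cover org.marc4j.MarcException as well, since that is the exception shown in the stack trace.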
Right now, the stack trace when the OAI Toolkit crashes looks something like
this:
........2009-07-01 18:11:46,906 [main] (Importer.java:492) INFO - [PRG] Modify statistics for 0_uiu_bibs_2_70000.xml: converted: 10000, invalid: 0 records. It took 00:00:10.084
2009-07-01 18:11:47,897 [main] (Importer.java:744) INFO - [PRG] This is a valid MARCXML file.
2009-07-01 18:11:47,897 [main] (Importer.java:435) INFO - [PRG] Modifying records...
.......... (10%)
.......... (20%)
.....[Fatal Error] :172386:6: The content of elements must consist of well-formed character data or markup.
org.marc4j.MarcException: Unable to parse input
    at org.marc4j.MarcXmlParser.parse(MarcXmlParser.java:95)
    at org.marc4j.MarcXmlParser.parse(MarcXmlParser.java:64)
    at org.marc4j.MarcXmlParserThread.run(MarcXmlParserThread.java:115)
Caused by: org.xml.sax.SAXParseException: The content of elements must consist of well-formed character data or markup.
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1231)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:522)
    at org.marc4j.MarcXmlParser.parse(MarcXmlParser.java:93)
    ... 2 more
Original comment by sva...@library.rochester.edu
on 10 Jul 2009 at 5:52
The bug was that the MARCXML file was not validated before being loaded into
the OAI Toolkit, which caused marc4j to crash. This has been changed so that
the file is now validated in two ways:
1. It is checked for well-formedness.
2. It is then validated against the schema.
If the file passes both checks, it is loaded into the OAI Toolkit.
Otherwise, the appropriate error is reported to the user in the logs.
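The two checks can be sketched with the standard JAXP APIs as below. This is a minimal illustration under assumptions, not the code that went into the toolkit: the class name MarcXmlPrecheck, the method check, and the idea of passing the schema as a local .xsd file are all hypothetical.

```java
import java.io.File;
import java.io.IOException;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the OAI Toolkit.
public class MarcXmlPrecheck {

    /** Returns null when the file passes both checks, otherwise a message for the log. */
    public static String check(File xml, File xsd) {
        // Step 1: well-formedness. A plain SAX parse fails on malformed markup.
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(xml, new DefaultHandler());
        } catch (SAXException | IOException | ParserConfigurationException e) {
            return "Not well-formed: " + e.getMessage();
        }
        // Step 2: schema validation. With no custom ErrorHandler set, the
        // validator throws on the first validation error.
        try {
            Schema schema = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(xsd);
            schema.newValidator().validate(new StreamSource(xml));
        } catch (SAXException | IOException e) {
            return "Schema validation failed: " + e.getMessage();
        }
        return null;
    }
}
```

A caller would load the file only when check returns null and write the returned message to the log otherwise. Running the well-formedness check first keeps the two kinds of failure distinguishable in the logs.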
Original comment by sva...@library.rochester.edu
on 17 Nov 2009 at 2:58
Incorporated in the 0.6.3 version of the software.
Original comment by sva...@library.rochester.edu
on 17 Nov 2009 at 4:16
Original issue reported on code.google.com by
sva...@library.rochester.edu
on 10 Jul 2009 at 5:49