schierlm / BibleMultiConverter

Converter written in Java to convert between different Bible program formats

Error converting USFM -> USX3 #64

Closed (shadow-light closed this issue 2 years ago)

shadow-light commented 2 years ago

Sorry, another issue today :)

I'm getting this error when converting this USFM to USX3. I'm not sure whether it's because the USFM has invalid markup or whether it's something else...

It seems to not like lemma="ngic hae-gba" as part of a \w tag?

Error: Command failed: java "-Dbiblemulticonverter.paratext.usx.verseseparatortext= " -jar BibleMultiConverter.jar ParatextConverter USFM "sources/mpp_wbt/usfm" USX3 "dist/bibles/mpp_wbt/usx" "*.usx"
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by com.sun.xml.bind.v2.runtime.reflect.opt.Injector (file:BibleMultiConverter/lib/jaxb-impl-2.2.11.jar) to method java.lang.ClassLoader.defineClass(java.lang.String,byte[],int,int)
WARNING: Please consider reporting this to the maintainers of com.sun.xml.bind.v2.runtime.reflect.opt.Injector
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Exception in thread "main" javax.xml.bind.MarshalException
 - with linked exception:
[org.xml.sax.SAXParseException; lineNumber: 0; columnNumber: 0; cvc-pattern-valid: Value 'ngic hae-gba' is not facet-valid with respect to pattern '\S+' for type 'NonEmptyOrBlankString'.]
        at com.sun.xml.bind.v2.runtime.MarshallerImpl.write(MarshallerImpl.java:326)
        at com.sun.xml.bind.v2.runtime.MarshallerImpl.marshal(MarshallerImpl.java:251)
        at javax.xml.bind.helpers.AbstractMarshallerImpl.marshal(AbstractMarshallerImpl.java:138)
        at biblemulticonverter.format.paratext.USX3.doExportBook(USX3.java:386)
        at biblemulticonverter.format.paratext.AbstractParatextFormat.doExportBooks(AbstractParatextFormat.java:435)
        at biblemulticonverter.tools.ParatextConverter.run(ParatextConverter.java:41)
        at biblemulticonverter.Main.main(Main.java:58)
Caused by: org.xml.sax.SAXParseException; lineNumber: 0; columnNumber: 0; cvc-pattern-valid: Value 'ngic hae-gba' is not facet-valid with respect to pattern '\S+' for type 'NonEmptyOrBlankString'.
        at java.xml/com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:204)
        at java.xml/com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.error(ErrorHandlerWrapper.java:135)
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:396)
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:327)
        at java.xml/com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:284)
        at java.xml/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator$XSIErrorReporter.reportError(XMLSchemaValidator.java:511)
        at java.xml/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.reportSchemaError(XMLSchemaValidator.java:3587)
        at java.xml/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.processOneAttribute(XMLSchemaValidator.java:3107)
        at java.xml/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.processAttributes(XMLSchemaValidator.java:3051)
        at java.xml/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.handleStartElement(XMLSchemaValidator.java:2286)
        at java.xml/com.sun.org.apache.xerces.internal.impl.xs.XMLSchemaValidator.startElement(XMLSchemaValidator.java:829)
        at java.xml/com.sun.org.apache.xerces.internal.jaxp.validation.ValidatorHandlerImpl.startElement(ValidatorHandlerImpl.java:570)
        at java.xml/org.xml.sax.helpers.XMLFilterImpl.startElement(XMLFilterImpl.java:551)
        at com.sun.xml.bind.v2.runtime.output.SAXOutput.endStartTag(SAXOutput.java:128)
        at com.sun.xml.bind.v2.runtime.output.ForkXmlOutput.endStartTag(ForkXmlOutput.java:106)
        at com.sun.xml.bind.v2.runtime.XMLSerializer.endAttributes(XMLSerializer.java:307)
        at com.sun.xml.bind.v2.runtime.XMLSerializer.childAsSoleContent(XMLSerializer.java:592)
        at com.sun.xml.bind.v2.runtime.ClassBeanInfoImpl.serializeRoot(ClassBeanInfoImpl.java:341)
        at com.sun.xml.bind.v2.runtime.property.ArrayReferenceNodeProperty.serializeListBody(ArrayReferenceNodeProperty.java:118)
        at com.sun.xml.bind.v2.runtime.property.ArrayERProperty.serializeBody(ArrayERProperty.java:159)
        at com.sun.xml.bind.v2.runtime.ClassBeanInfoImpl.serializeBody(ClassBeanInfoImpl.java:360)
        at com.sun.xml.bind.v2.runtime.XMLSerializer.childAsXsiType(XMLSerializer.java:696)
        at com.sun.xml.bind.v2.runtime.property.ArrayElementNodeProperty.serializeItem(ArrayElementNodeProperty.java:69)
        at com.sun.xml.bind.v2.runtime.property.ArrayElementProperty.serializeListBody(ArrayElementProperty.java:172)
        at com.sun.xml.bind.v2.runtime.property.ArrayERProperty.serializeBody(ArrayERProperty.java:159)
        at com.sun.xml.bind.v2.runtime.ClassBeanInfoImpl.serializeBody(ClassBeanInfoImpl.java:360)
        at com.sun.xml.bind.v2.runtime.XMLSerializer.childAsSoleContent(XMLSerializer.java:593)
        at com.sun.xml.bind.v2.runtime.ClassBeanInfoImpl.serializeRoot(ClassBeanInfoImpl.java:341)
        at com.sun.xml.bind.v2.runtime.XMLSerializer.childAsRoot(XMLSerializer.java:494)
        at com.sun.xml.bind.v2.runtime.MarshallerImpl.write(MarshallerImpl.java:323)
        ... 6 more
schierlm commented 2 years ago

Ah, it seems that the final USX3 schema allows whitespace in lemmas. Will push a commit today that updates the schema embedded in BibleMultiConverter to allow that too.
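
To illustrate why the embedded schema rejected the value: the error message quotes the pattern facet '\S+' for the type 'NonEmptyOrBlankString', and an XSD pattern facet has to match the whole attribute value. The sketch below approximates that check with java.util.regex; the class and method names are hypothetical and not part of BibleMultiConverter, and the real validation is done by the XML schema validator, not by this code.

```java
import java.util.regex.Pattern;

public class LemmaPatternCheck {
    // Pattern facet quoted in the error message for type 'NonEmptyOrBlankString'.
    private static final Pattern NON_BLANK = Pattern.compile("\\S+");

    // XSD pattern facets must match the entire value, so matches()
    // (rather than find()) is the closest java.util.regex analogue.
    public static boolean isFacetValid(String value) {
        return NON_BLANK.matcher(value).matches();
    }

    public static void main(String[] args) {
        // The lemma from the failing USFM contains a space, so it is rejected.
        System.out.println(isFacetValid("ngic hae-gba")); // prints false
        // A lemma without whitespace passes the same check.
        System.out.println(isFacetValid("ngic"));         // prints true
    }
}
```

Under this reading, any multi-word lemma fails validation until the pattern is relaxed, which is consistent with the fix described above of updating the embedded schema.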

As a workaround, you could also use the -Dbiblemulticonverter.skipxmlvalidation=true option (at the risk that the tool might output invalid XML).

shadow-light commented 2 years ago

Thanks for the quick fix! We're working on https://fetch.bible and this tool is helping a lot

schierlm commented 2 years ago

Good luck with your project!

If you are looking for some more recent/popular freely-licensed German translations:

The pages linked above are in German but probably easy enough to understand after machine translation. If you are not interested in these texts, that's fine too. It is probably easier to get the number of languages up than to get quality translations in less-niche languages.

shadow-light commented 2 years ago

Thanks Michael, that's really helpful. It would be great to have some modern German Bibles added.