phax / ph-ubl

Java library for reading and writing UBL 2.0, 2.1, 2.2, 2.3 and 2.4 documents
Apache License 2.0

Document type detection #65

Closed GediminasVaistai closed 1 month ago

GediminasVaistai commented 1 month ago

Hello,

Are there any functions that can detect the type of document I pass into "readXMLDOM" and automatically set the schema (e.g., invoice, catalogue)? Currently, I have to manually specify the document type before reading it (e.g., UBL21Marshaller.catalogue()).

  // Read
  final Document aDoc = DOMReader.readXMLDOM (new ClassPathResource (sFilename),
                                              new DOMReaderSettings ().setSchema (UBL21Marshaller.catalogue ().getSchema ()));
  assertNotNull (sFilename, aDoc);
  final CatalogueType aUBLObject = UBL21Marshaller.catalogue ().read (aDoc);
  assertNotNull (sFilename, aUBLObject);

  // Validate
  IErrorList aErrors = UBL21Marshaller.catalogue ().validate (aUBLObject);

phax commented 1 month ago

Yes, that would be nice, but unfortunately each marshaller is tightly bound to a specific set of XML Schemas. Note that the UBL2xMarshaller classes can also read directly from a file, stream or resource. There is an enumeration EUBL21DocumentType (also available for other UBL versions) that lists all the contained document types. With a little trickery you can easily identify the root element local name and the root element namespace URI required to use it. I assume that would already help you. Please let me know if it does, so that I can add it to the other versions as well and release an update afterwards.

GediminasVaistai commented 1 month ago

Thank you! The two new methods you created helped me identify and handle the document type I was trying to pass. I compared the root element local name and root element namespace URI of each EUBL21DocumentType entry with those of the document I passed in. When I found a matching pair, I used the following expression to get a schema, which I then passed to the readXMLDOM() method to validate the document, where e is an EUBL21DocumentType: XMLSchemaCache.getInstanceOfClassLoader(e.getClass().getClassLoader()).getSchema(e.getAllXSDResources())
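
For readers following along, the matching step described above could look roughly like the sketch below. It uses only the JDK's DOM parser, and the DocKind enum is a hypothetical two-entry stand-in for EUBL21DocumentType, not the ph-ubl API. The class name RootElementSniffer and the method detect are likewise made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Optional;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class RootElementSniffer {
  /** Hypothetical stand-in for EUBL21DocumentType; only two entries for illustration. */
  enum DocKind {
    INVOICE("Invoice", "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"),
    CATALOGUE("Catalogue", "urn:oasis:names:specification:ubl:schema:xsd:Catalogue-2");

    final String localName;
    final String namespaceURI;

    DocKind(String localName, String namespaceURI) {
      this.localName = localName;
      this.namespaceURI = namespaceURI;
    }
  }

  /** Parses the XML without a schema and matches the root element against DocKind. */
  static Optional<DocKind> detect(String xml) {
    try {
      DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      // Namespace awareness is required, otherwise getLocalName() returns null.
      factory.setNamespaceAware(true);
      factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);

      Element root = factory.newDocumentBuilder()
                            .parse(new InputSource(new StringReader(xml)))
                            .getDocumentElement();

      // Both the local name and the namespace URI must match.
      for (DocKind kind : DocKind.values()) {
        if (kind.localName.equals(root.getLocalName())
            && kind.namespaceURI.equals(root.getNamespaceURI())) {
          return Optional.of(kind);
        }
      }
      return Optional.empty();
    } catch (ParserConfigurationException | SAXException | IOException ex) {
      throw new IllegalArgumentException("Could not parse XML", ex);
    }
  }

  public static void main(String[] args) {
    String xml = "<Catalogue xmlns=\"urn:oasis:names:specification:ubl:schema:xsd:Catalogue-2\"/>";
    System.out.println(detect(xml));
  }
}
```

Once a document type has been identified this way, the document can be read a second time with the schema of the matching type, as in the expression quoted above.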

phax commented 1 month ago

Okay, great. So I will add the 2 methods for all UBL versions and create an updated release 9.0.3.