phax / ph-schematron

Java Schematron library that supports XSLT and native application
Apache License 2.0

Support XML catalogs in Apache Ant Task #40

Closed stefan-jung closed 7 years ago

stefan-jung commented 7 years ago

It would be nice if ph-schematron supported XML catalogs. Ideally, it would follow the same convention as Ant's <xslt> task:

<xslt basedir="doc" destdir="build/doc" extension=".html" style="style/apache.xsl">
  <xmlcatalog refid="mycatalog"/>
</xslt>

<xslt basedir="doc" destdir="build/doc" extension=".html" style="style/apache.xsl">
   <xmlcatalog>
       <dtd publicId="-//ArielPartners//DTD XML Article V1.0//EN"
         location="com/arielpartners/knowledgebase/dtd/article.dtd"/>
   </xmlcatalog>
</xslt>

This would be very important for working with, e.g., DITA XML files. See the grammar here: org.oasis-open.dita.v1_3

phax commented 7 years ago

In theory this is pretty straightforward, but currently I have an EntityResolver/UrlResolver only for parsing the Schematron file, not for the XML to be validated :| So this will need some time...
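
To illustrate what an entity resolver for the validated XML involves, here is a minimal Java sketch using only the JDK's JAXP API. It is not ph-schematron's actual implementation: the class name `CatalogResolverSketch`, the in-memory `CATALOG` map, and the one-line DTD are assumptions made up for this example. The resolver maps a DOCTYPE public ID to a local DTD, so the parser never tries to open the relative system ID on disk.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

class CatalogResolverSketch {
  // Hypothetical in-memory "catalog": public ID -> DTD text.
  static final Map<String, String> CATALOG =
      Map.of("-//bla//DTD XML test//EN", "<!ELEMENT root (#PCDATA)>");

  // Counts how often the resolver answered from the catalog.
  static int hits = 0;

  static Document parse(String xml) throws Exception {
    DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    EntityResolver resolver = (publicId, systemId) -> {
      String dtd = publicId == null ? null : CATALOG.get(publicId);
      if (dtd == null) {
        return null; // unknown public ID: let the parser use its default lookup
      }
      hits++;
      InputSource source = new InputSource(new StringReader(dtd));
      source.setPublicId(publicId);
      source.setSystemId(systemId);
      return source;
    };
    builder.setEntityResolver(resolver);
    InputSource in = new InputSource(new StringReader(xml));
    // Made-up base URI; without the resolver, "test.dtd" would be looked up next to it.
    in.setSystemId("file:///nonexistent-dir/test.xml");
    return builder.parse(in);
  }

  public static void main(String[] args) throws Exception {
    String xml = "<?xml version=\"1.0\"?>"
        + "<!DOCTYPE root PUBLIC \"-//bla//DTD XML test//EN\" \"test.dtd\">"
        + "<root>hi</root>";
    Document doc = parse(xml);
    System.out.println(doc.getDocumentElement().getNodeName() + ", catalog hits: " + hits);
  }
}
```

A real XML catalog adds file-based lookup and delegation rules on top of this idea, but the hook into the parser is the same `setEntityResolver` call.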

stefan-jung commented 7 years ago

Alright, I'll lean back and enjoy the show. 🍸 🏖

phax commented 7 years ago

Can you please check whether ant-apache-resolver.jar is part of the default Ant setup? Thanks

phax commented 7 years ago

The next SNAPSHOT (after 14:41 UTC) should do the trick. Please try it then. I tested with the following setup:

    <schematron schematronFile="test.sch" 
                expectSuccess="true"
                schematronProcessingEngine="pure">
      <fileset dir=".">
        <include name="test.xml" />
      </fileset>
      <xmlcatalog>
        <dtd publicId="-//bla//DTD XML test//EN" location="test.dtd"/>
      </xmlcatalog>
    </schematron>

stefan-jung commented 7 years ago

build.xml

<?xml version="1.0" encoding="UTF-8"?>
<project basedir="." default="check">
  <loadproperties srcfile="test.properties"/>
  <!-- Create path -->
  <path id="phsch.path">
    <pathelement location="ph-schematron-ant-task-4.2.3-20170505.144759-8-jar-with-dependencies.jar"/>
    <pathelement location="resolver.jar"/>
  </path>
  <xmlcatalog id="dita.catalog">
    <catalogpath path="../../.dita/dita-ot/catalog-dita.xml"/>
  </xmlcatalog>
  <!-- Define <schematron> task -->
  <taskdef name="schematron" classname="com.helger.schematron.ant.Schematron" classpathref="phsch.path" />
  <target name="check">
    <schematron schematronFile="test.sch" expectSuccess="true">
      <fileset dir=".">
        <include name="*.dita" />
      </fileset>
      <xmlcatalog refid="dita.catalog"/>
    </schematron>
  </target>
</project>

Output

check:
[schematron] Successfully parsed Schematron file 'C:\Users\eike\Desktop\ph-schematron\test.sch'
[schematron] Validating XML file 'C:\Users\eike\Desktop\ph-schematron\test.dita' against Schematron rules from 'test.sch' expecting success
[schematron] [main] ERROR com.helger.commons.callback.exception.LoggingExceptionCallback - Error reading XML document: C:\Users\eike\Desktop\ph-schematron\topic.dtd (The system cannot find the file specified)
[schematron] java.io.FileNotFoundException: C:\Users\eike\Desktop\ph-schematron\topic.dtd (The system cannot find the file specified)
[schematron]    at java.io.FileInputStream.open0(Native Method)
[schematron]    at java.io.FileInputStream.open(Unknown Source)
[schematron]    at java.io.FileInputStream.<init>(Unknown Source)
[schematron]    at java.io.FileInputStream.<init>(Unknown Source)
[schematron]    at sun.net.www.protocol.file.FileURLConnection.connect(Unknown Source)
[schematron]    at sun.net.www.protocol.file.FileURLConnection.getInputStream(Unknown Source)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(Unknown Source)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(Unknown Source)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(Unknown Source)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(Unknown Source)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(Unknown Source)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(Unknown Source)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
[schematron]    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
[schematron]    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
[schematron]    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
[schematron]    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
[schematron]    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown Source)
[schematron]    at com.helger.xml.serialize.read.DOMReader.readXMLDOM(DOMReader.java:331)
[schematron]    at com.helger.xml.serialize.read.DOMReader.readXMLDOM(DOMReader.java:181)
[schematron]    at com.helger.schematron.SchematronResourceHelper.getNodeOfSource(SchematronResourceHelper.java:119)
[schematron]    at com.helger.schematron.AbstractSchematronResource.getAsNode(AbstractSchematronResource.java:166)
[schematron]    at com.helger.schematron.AbstractSchematronResource.applySchematronValidationToSVRL(AbstractSchematronResource.java:243)
[schematron]    at com.helger.schematron.ant.Schematron._performValidation(Schematron.java:288)
[schematron]    at com.helger.schematron.ant.Schematron.execute(Schematron.java:449)
[schematron]    at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:292)
[schematron]    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[schematron]    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
[schematron]    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
[schematron]    at java.lang.reflect.Method.invoke(Unknown Source)
[schematron]    at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
[schematron]    at org.apache.tools.ant.Task.perform(Task.java:348)
[schematron]    at org.apache.tools.ant.Target.execute(Target.java:435)
[schematron]    at org.apache.tools.ant.Target.performTasks(Target.java:456)
[schematron]    at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1393)
[schematron]    at org.apache.tools.ant.Project.executeTarget(Project.java:1364)
[schematron]    at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
[schematron]    at org.apache.tools.ant.Project.executeTargets(Project.java:1248)
[schematron]    at org.apache.tools.ant.Main.runBuild(Main.java:851)
[schematron]    at org.apache.tools.ant.Main.startAnt(Main.java:235)
[schematron]    at org.apache.tools.ant.launch.Launcher.run(Launcher.java:280)
[schematron]    at org.apache.tools.ant.launch.Launcher.main(Launcher.java:109)

BUILD FAILED
C:\Users\eike\Desktop\ph-schematron\build.xml:15: Exception validating XML 'C:\Users\eike\Desktop\ph-schematron\test.dita' against Schematron rules from 'test.sch'. Technical details: IllegalArgumentException - Failed to read source javax.xml.transform.stream.StreamSource@26b3fd41 as XML from SystemID 'file:/C:/Users/eike/Desktop/ph-schematron/test.dita'

phax commented 7 years ago

So

[error] @ file:/C:/Users/eike/.dita/dita-ot/catalog-dita.xml(7:5) [SAX] The markup declarations contained or pointed to by the document type declaration must be well-formed. (org.xml.sax.SAXParseException: The markup declarations contained or pointed to by the document type declaration must be well-formed.)

That error comes from the catalog file itself, so it's not on my side ;)

stefan-jung commented 7 years ago

I'll check that again, but I just pointed it to the DITA default. Could you please have a look at the syntax of my build.xml? Should this work?

phax commented 7 years ago

Looks good to me. In my experience the Java XML parser is quite strict... Maybe you can post the relevant part of .dita/dita-ot/catalog-dita.xml (line 7, column 5)? Most likely I will continue responding on Monday :)

stefan-jung commented 7 years ago

This is the file catalog.xml

I'll switch machines and try again. Thank you for the work and have a nice weekend.

phax commented 7 years ago

java.io.FileNotFoundException: C:\Users\eike\Desktop\ph-schematron\topic.dtd (The system cannot find the file specified)

Seems like the base path is not correct. Is "resolver.jar" the same as "ant-apache-resolver.jar"?
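
The path in that exception is what a parser gets when no catalog entry matches: the relative system ID "topic.dtd" from the DOCTYPE is resolved against the URI of the document being parsed. A small Java sketch of that resolution step, using plain `java.net.URI` (the class and method names are made up for illustration):

```java
import java.net.URI;

class BaseUriSketch {
  // Resolves a DOCTYPE system ID against the URI of the document that declares it.
  static URI resolveSystemId(String documentUri, String systemId) {
    return URI.create(documentUri).resolve(systemId);
  }

  public static void main(String[] args) {
    URI dtd = resolveSystemId(
        "file:/C:/Users/eike/Desktop/ph-schematron/test.dita", "topic.dtd");
    // prints file:/C:/Users/eike/Desktop/ph-schematron/topic.dtd
    System.out.println(dtd);
  }
}
```

So a lookup of topic.dtd next to test.dita suggests that the catalog was not consulted at all, rather than that it returned a wrong location.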

stefan-jung commented 7 years ago

tl;dr: It works with the resolver.jar from the oXygen XML /lib dir, but not with the ant-apache-resolver-1.9.7.jar. My Ant version is Apache Ant(TM) version 1.9.7 compiled on April 24 2016, so the library version matches the Ant version.


Test Files

test.dita

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE topic PUBLIC "-//OASIS//DTD DITA Topic//EN" "topic.dtd">
<topic id="topic">
    <title></title>
    <body>
        <p></p>
    </body>
</topic>

test.sch

<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2"
  xmlns:sqf="http://www.schematron-quickfix.com/validator/process">
  <sch:pattern>
    <sch:rule context="*[contains(@class, ' topic/topic ')]">
      <sch:assert test="buddy">Topics should have buddys, not bodys</sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>

Scenario 1: With resolver.jar

build.xml with resolver.jar

<?xml version="1.0" encoding="UTF-8"?>
<project basedir="." default="check">
  <loadproperties srcfile="test.properties"/>
  <path id="phsch.path">
    <pathelement location="ph-schematron-ant-task-4.2.3-20170505.144759-8-jar-with-dependencies.jar"/>
    <pathelement location="resolver.jar"/>
  </path>
  <xmlcatalog id="dita.catalog">
    <catalogpath>
      <pathelement location="org.oasis-open.dita.v1_3/catalog.xml"/>
    </catalogpath>
  </xmlcatalog>
  <taskdef name="schematron" classname="com.helger.schematron.ant.Schematron" classpathref="phsch.path"/>
  <target name="check">
    <schematron schematronFile="test.sch" expectSuccess="true">
      <xmlcatalog refid="dita.catalog"/>
      <fileset dir=".">
        <include name="*.dita"/>
      </fileset>
    </schematron>
  </target>
</project>

Output

check:
[schematron] Successfully parsed Schematron file '/home/stefan/Schreibtisch/schematron-test/test.sch'
[schematron] Validating XML file '/home/stefan/Schreibtisch/schematron-test/test.dita' against Schematron rules from 'test.sch' expecting success
[schematron] 1 failed Schematron assertions for XML file '/home/stefan/Schreibtisch/schematron-test/test.dita'
[schematron] [error] in /topic @ /home/stefan/Schreibtisch/schematron-test/test.dita Topics should have buddys, not bodys Test=buddy
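
The failed assertion is the expected result here: the rule fires on the topic element, and the test `buddy` looks for a child element named buddy, which does not exist. The following Java sketch reproduces that logic with the JDK's XPath 1.0 engine. It is only an approximation of what a Schematron engine does, and it rests on two assumptions: the `class` attribute is written out by hand (in real DITA files it is a default value supplied by the DTD, which is why DTD resolution matters for this rule), and the class and method names are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class AssertSketch {
  // test.dita without the DOCTYPE; the class attribute is added explicitly (assumption).
  static final String TOPIC =
      "<topic id=\"topic\" class=\"- topic/topic \"><title/><body><p/></body></topic>";

  static final String RULE_CONTEXT = "//*[contains(@class, ' topic/topic ')]";

  // Returns true if the assert test holds for the first node matching the rule context.
  static boolean assertHolds(String xml, String contextXPath, String testXPath)
      throws Exception {
    Document doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder()
        .parse(new InputSource(new StringReader(xml)));
    XPath xpath = XPathFactory.newInstance().newXPath();
    Node context = (Node) xpath.evaluate(contextXPath, doc, XPathConstants.NODE);
    if (context == null) {
      throw new IllegalArgumentException("rule context matched nothing");
    }
    return (Boolean) xpath.evaluate(testXPath, context, XPathConstants.BOOLEAN);
  }

  public static void main(String[] args) throws Exception {
    System.out.println("buddy: " + assertHolds(TOPIC, RULE_CONTEXT, "buddy")); // false
    System.out.println("body: " + assertHolds(TOPIC, RULE_CONTEXT, "body"));   // true
  }
}
```

If the DTD were not loaded and the `class` attribute were absent, the rule context would match nothing and the assertion would never be evaluated.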

Scenario 2: With ant-apache-resolver-1.9.7.jar

build.xml with ant-apache-resolver-1.9.7.jar

<?xml version="1.0" encoding="UTF-8"?>
<project basedir="." default="check">
  <loadproperties srcfile="test.properties"/>
  <path id="phsch.path">
    <pathelement location="ph-schematron-ant-task-4.2.3-20170505.144759-8-jar-with-dependencies.jar"/>
    <pathelement location="ant-apache-resolver-1.9.7.jar"/>
  </path>
  <xmlcatalog id="dita.catalog">
    <catalogpath>
      <pathelement location="org.oasis-open.dita.v1_3/catalog.xml"/>
    </catalogpath>
  </xmlcatalog>
  <taskdef name="schematron" classname="com.helger.schematron.ant.Schematron" classpathref="phsch.path"/>
  <target name="check">
    <schematron schematronFile="test.sch" expectSuccess="true">
      <xmlcatalog refid="dita.catalog"/>
      <fileset dir=".">
        <include name="*.dita"/>
      </fileset>
    </schematron>
  </target>
</project>

Output

check:
[schematron] Successfully parsed Schematron file '/home/stefan/Schreibtisch/schematron-test/test.sch'
[schematron] Validating XML file '/home/stefan/Schreibtisch/schematron-test/test.dita' against Schematron rules from 'test.sch' expecting success
Warning: XML resolver not found; external catalogs will be ignored
[schematron] [main] ERROR com.helger.commons.callback.exception.LoggingExceptionCallback - Error reading XML document: /home/stefan/Schreibtisch/schematron-test/topic.dtd (No such file or directory)
[schematron] java.io.FileNotFoundException: /home/stefan/Schreibtisch/schematron-test/topic.dtd (No such file or directory)
[schematron]    at java.io.FileInputStream.open0(Native Method)
[schematron]    at java.io.FileInputStream.open(FileInputStream.java:195)
[schematron]    at java.io.FileInputStream.<init>(FileInputStream.java:138)
[schematron]    at java.io.FileInputStream.<init>(FileInputStream.java:93)
[schematron]    at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:90)
[schematron]    at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:188)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.setupCurrentEntity(XMLEntityManager.java:623)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startEntity(XMLEntityManager.java:1304)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager.startDTDEntity(XMLEntityManager.java:1270)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLDTDScannerImpl.setInputSource(XMLDTDScannerImpl.java:264)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.dispatch(XMLDocumentScannerImpl.java:1161)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$DTDDriver.next(XMLDocumentScannerImpl.java:1045)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl$PrologDriver.next(XMLDocumentScannerImpl.java:959)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
[schematron]    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
[schematron]    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:841)
[schematron]    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:770)
[schematron]    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
[schematron]    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:243)
[schematron]    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:339)
[schematron]    at com.helger.xml.serialize.read.DOMReader.readXMLDOM(DOMReader.java:331)
[schematron]    at com.helger.xml.serialize.read.DOMReader.readXMLDOM(DOMReader.java:181)
[schematron]    at com.helger.schematron.SchematronResourceHelper.getNodeOfSource(SchematronResourceHelper.java:119)
[schematron]    at com.helger.schematron.AbstractSchematronResource.getAsNode(AbstractSchematronResource.java:166)
[schematron]    at com.helger.schematron.AbstractSchematronResource.applySchematronValidationToSVRL(AbstractSchematronResource.java:243)
[schematron]    at com.helger.schematron.ant.Schematron._performValidation(Schematron.java:288)
[schematron]    at com.helger.schematron.ant.Schematron.execute(Schematron.java:449)
[schematron]    at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:293)
[schematron]    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[schematron]    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[schematron]    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[schematron]    at java.lang.reflect.Method.invoke(Method.java:498)
[schematron]    at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
[schematron]    at org.apache.tools.ant.Task.perform(Task.java:348)
[schematron]    at org.apache.tools.ant.Target.execute(Target.java:435)
[schematron]    at org.apache.tools.ant.Target.performTasks(Target.java:456)
[schematron]    at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1405)
[schematron]    at org.apache.tools.ant.Project.executeTarget(Project.java:1376)
[schematron]    at org.apache.tools.ant.helper.DefaultExecutor.executeTargets(DefaultExecutor.java:41)
[schematron]    at org.apache.tools.ant.Project.executeTargets(Project.java:1260)
[schematron]    at org.apache.tools.ant.Main.runBuild(Main.java:854)
[schematron]    at org.apache.tools.ant.Main.startAnt(Main.java:236)
[schematron]    at org.apache.tools.ant.launch.Launcher.run(Launcher.java:285)
[schematron]    at org.apache.tools.ant.launch.Launcher.main(Launcher.java:112)

BUILD FAILED
/home/stefan/Schreibtisch/schematron-test/build.xml:16: Exception validating XML '/home/stefan/Schreibtisch/schematron-test/test.dita' against Schematron rules from 'test.sch'. Technical details: IllegalArgumentException - Failed to read source javax.xml.transform.stream.StreamSource@242b836 as XML from SystemID 'file:/home/stefan/Schreibtisch/schematron-test/test.dita'
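
The warning "XML resolver not found; external catalogs will be ignored" in the output above points to a missing library rather than a wrong catalog. As far as I know, ant-apache-resolver.jar holds only Ant's adapter code, while the catalog parsing itself lives in the xml-commons resolver library (resolver.jar); that would explain why Scenario 1 works and Scenario 2 does not. A quick way to check a classpath is a sketch like the following. The class name `org.apache.xml.resolver.tools.CatalogResolver` is my assumption about what must be loadable, and `ResolverClasspathCheck` is a made-up helper.

```java
class ResolverClasspathCheck {
  // Assumption: the xml-commons resolver class that must be loadable for external catalogs.
  static final String RESOLVER_CLASS = "org.apache.xml.resolver.tools.CatalogResolver";

  // Returns true if the named class can be located by this class's class loader.
  static boolean isOnClasspath(String className) {
    try {
      // initialize=false: only locate the class, do not run its static initializers.
      Class.forName(className, false, ResolverClasspathCheck.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(RESOLVER_CLASS + " available: " + isOnClasspath(RESOLVER_CLASS));
  }
}
```

Run it with the same JARs that the `phsch.path` classpath lists, for example `java -cp resolver.jar:. ResolverClasspathCheck` on Linux.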

@phax Great work. I'll close this issue, because the problem is probably not in your code.

phax commented 7 years ago

I added another test with a couple of directories, and it all works fine. Maybe you can try it again without any resolver JAR in the classpath. By the way, because of some incompatible changes in the base library, I will change the version to 4.3.0.