phax / ph-schematron

Java Schematron library that supports XSLT and native application
Apache License 2.0

Unable to use matches() function in SchematronResourcePure #144

Closed pradyumanaggarwal closed 1 year ago

pradyumanaggarwal commented 1 year ago

Using ph-schematron 4.0.0, the matches() function is not working when writing Schematrons. "isValidSchematron" returns false when validating the Schematron file.

Schematron File : `<?xml version="1.0" encoding="UTF-8"?> <sch:schema xmlns="http://purl.oclc.org/dsdl/schematron" xmlns:sch="http://purl.oclc.org/dsdl/schematron" xmlns:sqf="urn:anything" queryBinding="xslt2" schemaVersion="ISO19757-3"> Schematron 1</sch:title>

The title must contain the word "Legal". Bold must be there in element `

Also, I am defining `queryBinding="xslt2"` to use matches(), which is an XPath 2.0 function.

phax commented 1 year ago

Hi @pradyumanaggarwal, please use version 7.0.1, the latest version, instead of 4.0.0 :) The pure version cannot handle XSLT; it is limited to XPath. So if you want to use XSLT expressions, you have to use the XSLT version. See https://github.com/phax/ph-schematron/wiki for more details.

hth

pradyumanaggarwal commented 1 year ago

Even when using `queryBinding="xslt2"`? Will that help to use XSLT 2 functions in SchematronResourcePure?

phax commented 1 year ago

No. There is no way to run XSLT functions with the "pure" implementation. You MUST switch to the other solution.

pradyumanaggarwal commented 1 year ago

        String schematron = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
                "<sch:schema xmlns:sch=\"http://purl.oclc.org/dsdl/schematron\" queryBinding=\"xslt2\">\n" +
                "  <sch:title>Schematron 1</sch:title>\n" +
                "  <sch:pattern>\n" +
                "    <sch:rule context=\"title\"> \n" +
                "      <sch:assert test=\"b\"> Bold must be there in <sch:name/> element</sch:assert> \n" +
                "    </sch:rule>\n" +
                "  </sch:pattern>\n" +
                "</sch:schema>";
        File schematronFile = new File("schematronFile.sch");
        try {
            FileWriter writer = new FileWriter(schematronFile);
            writer.write(schematron);
            writer.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
        String xmlFile = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
                "<!DOCTYPE topic PUBLIC \"-//OASIS//DTD DITA Composite//EN\" \"technicalContent/dtd/ditabase.dtd\">\n" +
                "<topic id=\"id168TG0I0RYF\">\n" +
                "  <title>Hi Legal done</title>\n" +
                "  <shortdesc>Content is provided for demonstration purposes only. <ph audience=\"Administrator\">Administrators and operators must read the manual before operating a new vehicle. </ph> <ph audience=\"EndUser\">Any user must read the manual before operating a new vehicle. </ph> <ph product=\"ProductA\">Your luxuriously appointed way to travel to the stars awaits you! </ph> <ph product=\"ProductB\">Your well appointed spaceship awaits you! </ph></shortdesc>\n" +
                "  <prolog>\n" +
                "  </prolog>\n" +
                "  <body>\n" +
                "    <p></p>\n" +
                "    <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua</p>\n" +
                "  </body>\n" +
                "</topic>";
        File XMLFile = new File("XMLFile.xml");
        try {
            FileWriter writer = new FileWriter(XMLFile);
            writer.write(xmlFile);
            writer.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
        final ISchematronResource aResSCH = SchematronResourceXSLT.fromFile (schematronFile);
        boolean ans = aResSCH.isValidSchematron();
        ISchematronResource schemaResourcePureCheck = SchematronResourcePure.fromFile(schematronFile);
        boolean ans2 = schemaResourcePureCheck.isValidSchematron();

I have tried using SchematronResourceXSLT, but isValidSchematron returns false. With SchematronResourcePure I get true. Is anything else required?

I have also included this dependency

<dependency>
    <groupId>com.helger.schematron</groupId>
    <artifactId>ph-schematron-xslt</artifactId>
    <version>6.1.0</version>
</dependency>

Can anyone help? @phax

phax commented 1 year ago

The problem is that you are calling SchematronResourceXSLT.fromFile. Use SchematronResourceSCH.from... instead (see the above example code). SCH can be "precompiled" to XSLT for performance reasons (e.g. with the Maven plugin). If you have such a precompiled file, you would use SchematronResourceXSLT, but in your case, where you have the SCH, you should use SchematronResourceSCH.

hth

pradyumanaggarwal commented 1 year ago

        String schematron = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
                "<sch:schema xmlns:sch=\"http://purl.oclc.org/dsdl/schematron\" queryBinding=\"xslt2\">\n" +
                "  <sch:title>Schematron 1</sch:title>\n" +
                "  <sch:pattern>\n" +
                "    <sch:rule context=\"title\"> \n" +
                "      <sch:assert test=\"b\"> Bold must be there in <sch:name/> element</sch:assert> \n" +
                "    </sch:rule>\n" +
                "  </sch:pattern>\n" +
                "</sch:schema>";
        File schematronFile = new File("schematronFile.sch");
        try {
            FileWriter writer = new FileWriter(schematronFile);
            writer.write(schematron);
            writer.close();
        } catch (IOException e) {
            e.printStackTrace();
        }

        final ISchematronResource aResSCHCheck1 = SchematronResourceSCH.fromFile(schematronFile);
        final ISchematronResource aResSCHCheck2 = SchematronResourceXSLT.fromFile(schematronFile);
        final ISchematronResource aResSCHCheck3 = SchematronResourcePure.fromFile(schematronFile);

        boolean ans1 = aResSCHCheck1.isValidSchematron();
        boolean ans2 = aResSCHCheck2.isValidSchematron();
        boolean ans3 = aResSCHCheck3.isValidSchematron();

Even with the SCH implementation, isValidSchematron() returns false, but the Pure implementation gives me true. Is anything missing? @phax

phax commented 1 year ago

See the copy paste error:

    boolean ans1 = aResSCHCheck2.isValidSchematron();
    boolean ans2 = aResSCHCheck2.isValidSchematron();
    boolean ans3 = aResSCHCheck3.isValidSchematron();

You are checking aResSCHCheck2 twice.

pradyumanaggarwal commented 1 year ago

Just fixed that; it was a typo. But I am still getting false when using SchematronResourceSCH.

phax commented 1 year ago

This code works for me in the test:

  @Test
  public void test2 ()
  {
    final String schematron = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
                              "<sch:schema xmlns:sch=\"http://purl.oclc.org/dsdl/schematron\" queryBinding=\"xslt2\">\n" +
                              "  <sch:title>Schematron 1</sch:title>\n" +
                              "  <sch:pattern>\n" +
                              "    <sch:rule context=\"title\"> \n" +
                              "      <sch:assert test=\"b\"> Bold must be there in <sch:name/> element</sch:assert> \n" +
                              "    </sch:rule>\n" +
                              "  </sch:pattern>\n" +
                              "</sch:schema>";
    final File schematronFile = new File ("schematronFile.sch");
    try
    {
      final FileWriter writer = new FileWriter (schematronFile);
      writer.write (schematron);
      writer.close ();
    }
    catch (final IOException e)
    {
      e.printStackTrace ();
    }

    final ISchematronResource aResSCHCheck1 = SchematronResourceSCH.fromFile (schematronFile);
    final ISchematronResource aResSCHCheck2 = SchematronResourceXSLT.fromFile (schematronFile);

    final boolean ans1 = aResSCHCheck1.isValidSchematron ();
    assertTrue (ans1);
    final boolean ans2 = aResSCHCheck2.isValidSchematron ();
    assertFalse (ans2);
  }

phax commented 1 year ago

Make sure to use ph-schematron 6.3.4, the latest version of the 6.x series; maybe that makes a difference. Otherwise, please make sure you use the matching Saxon HE version and not Xalan or other engines.

pradyumanaggarwal commented 1 year ago

In Maven, the latest version of ph-schematron you provide is 5.6.5, right?

phax commented 1 year ago

No, the latest version is 6.3.4. At some point I changed the group ID to be com.helger.schematron instead of com.helger. Use search.maven.org to search for ph-schematron-xslt.

The latest versions are 7.1.0 for Saxon 12 and Java 11, 7.0.1 for Saxon 11 and Java 11, and 6.3.4 for Saxon 11 and Java 8.

pradyumanaggarwal commented 1 year ago

Is it possible to get all failed asserts and successful reports in a single run? Currently the validation stops after the first error.

SchematronOutputType svrl = aResSCHCheck.applySchematronValidationToSVRL (new StringStreamSource(xmlFile));

@phax

phax commented 1 year ago

Indeed there is: See SVRLHelper.getAllFailedAssertionsAndSuccessfulReports (SchematronOutputType) - it returns all failed assertions and successful reports. See other methods in this helper class as well for other, potentially more tailored APIs.

pradyumanaggarwal commented 1 year ago

It's unusual, but the same code works in a test file and not in the main function when run from a build. A possible reason is that the file is not read properly at run time when getSchematronXSLTProvider() is called. Is there any way of passing the SCH file through an input stream, like we can with SchematronResourcePure? I can't find fromInputStream() in the SchematronResourceSCH documentation.

        final String schematron = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
                "<sch:schema xmlns:sch=\"http://purl.oclc.org/dsdl/schematron\" queryBinding=\"xslt2\">\n" +
                "  <sch:title>Schematron 1</sch:title>\n" +
                "  <sch:pattern>\n" +
                "    <sch:rule context=\"title\"> \n" +
                "      <sch:report test=\"b\"> Bold must be there in <sch:name/> element</sch:report> \n" +
                "    </sch:rule>\n" +
                "    <sch:rule context=\"title\">\n" +
                "      <sch:assert test=\"matches(., '^Hipe.*')\">Title must start with 'hello'</sch:assert>\n" +
                "    </sch:rule>\n" +
                "  </sch:pattern>\n" +
                "</sch:schema>";
        final File schematronFile = new File ("schematronFileAnother.sch");
        try {
            final FileWriter writer = new FileWriter (schematronFile);
            writer.write (schematron);
            writer.close ();
        } catch (final IOException e) {
            e.printStackTrace ();
        }

        final ISchematronResource aResSCHCheck1 = SchematronResourceSCH.fromFile (schematronFile);

        boolean valid = aResSCHCheck1.isValidSchematron();

@phax

phax commented 1 year ago

Most likely because you're not passing a Charset in the FileWriter constructor. Anyway, if you look at the factory methods of SchematronResourceSCH you can see fromInputStream - et voila. Additionally fromURL, fromString, fromByteArray, fromClassPath etc. are all there for your convenience :) Make sure to use 6.3.4 to get all of the variants.
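
For reference, a minimal sketch of writing the Schematron source with an explicit charset, using only JDK classes that have been available since Java 7. The class and method names are made up for illustration. Whether the platform default encoding is really what breaks the main-method run is an assumption, as the "most likely" above indicates.

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of ph-schematron.
class SchematronFileWriter {
    // Writes the content as UTF-8 so the bytes on disk match the
    // encoding="UTF-8" declared in the XML prolog, whatever the
    // platform default charset is.
    static Path writeUtf8(Path target, String content) throws IOException {
        // try-with-resources closes the writer even if write() throws
        try (Writer writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.write(content);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        String schematron = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<sch:schema xmlns:sch=\"http://purl.oclc.org/dsdl/schematron\" queryBinding=\"xslt2\"/>";
        Path file = writeUtf8(Files.createTempFile("schematronFile", ".sch"), schematron);
        // Read the bytes back with the same charset to confirm the round trip
        String readBack = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        System.out.println(readBack.equals(schematron));
    }
}
```

The resulting file can then be handed to SchematronResourceSCH.fromFile as in the snippets above, or the string can be passed to one of the other factory methods phax lists.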

pradyumanaggarwal commented 1 year ago

I am using SchematronResourceSCH from ph-schematron, but only the first rule in the sch file is working. How can I get all rules to fire and collect all broken rules? In SchematronResourcePure, getActivePatternAndFiredRuleAndFailedAssert() gives all broken rules. Because of some dependency issues, I can only use ph-schematron.

@Test
    public void test2 () throws Exception {

        final String schematron = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
                "<sch:schema xmlns:sch=\"http://purl.oclc.org/dsdl/schematron\" queryBinding=\"xslt2\">\n" +
                "  <sch:title>Schematron 1</sch:title>\n" +
                "  <sch:pattern>\n" +
                "    <sch:rule context=\"title\">\n" +
                "      <sch:assert test=\"matches(., '^Legal.*')\">Title must start with 'Legal'</sch:assert>\n" +
                "    </sch:rule>\n" +
                "    <sch:rule context=\"title\"> \n" +
                "      <sch:assert test=\"b\"> Bold must be there in <sch:name/> element</sch:assert> \n" +
                "    </sch:rule>\n" +
                "  </sch:pattern>\n" +
                "</sch:schema>";
        final File schematronFile = new File ("schematronFile.sch");
        try
        {
            final FileWriter writer = new FileWriter (schematronFile);
            writer.write (schematron);
            writer.close ();
        }
        catch (final IOException e)
        {
            e.printStackTrace ();
        }

        final ISchematronResource aResSCHCheck1 = SchematronResourceSCH.fromFile (schematronFile);
        aResSCHCheck1.setUseCache(false);
        boolean ans1 = aResSCHCheck1.isValidSchematron();
        System.out.println(ans1);
        Assert.assertTrue (ans1);

        final String xmlFile = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
                "<topic id=\"id168TG0I0RYF\">\n" +
                "  <title>Legal</title>\n" +
                "  <shortdesc>Content is provided for demonstration purposes only. <ph audience=\"Administrator\">Administrators and operators must read the manual before operating a new vehicle. </ph> <ph audience=\"EndUser\">Any user must read the manual before operating a new vehicle. </ph> <ph product=\"ProductA\">Your luxuriously appointed way to travel to the stars awaits you! </ph> <ph product=\"ProductB\">Your well appointed spaceship awaits you! </ph></shortdesc>\n" +
                "  <prolog>\n" +
                "  </prolog>\n" +
                "  <body>\n" +
                "    <p></p>\n" +
                "    <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua</p>\n" +
                "  </body>\n" +
                "</topic>";
        final SchematronOutputType svrl;
        try {
            svrl = aResSCHCheck1.applySchematronValidationToSVRL (new StringStreamSource(xmlFile));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        assertNotNull (svrl);
        List<Object> failedRules = svrl.getActivePatternAndFiredRuleAndFailedAssert();
        List<AssertionClass> errors = new ArrayList<>();
        for (Object object : failedRules) {
            if (object instanceof FailedAssert) {
                FailedAssert failedAssert = (FailedAssert) object;
                AssertionClass assertion = new AssertionClass(failedAssert.getTest(), failedAssert.getText(), failedAssert.getLocation());
                System.out.println(failedAssert.getText());

                errors.add(assertion);
            }
        }

        for (Object object : failedRules) {
            if (object instanceof SuccessfulReport) {
                SuccessfulReport successfulReport = (SuccessfulReport) object;
                AssertionClass assertion = new AssertionClass(successfulReport.getTest(), successfulReport.getText(), successfulReport.getLocation());
                System.out.println(successfulReport.getText());
                errors.add(assertion);
            }
        }
    }

If I switch the order of the rules in the sch file, I get "Bold must be there in title element" instead.

@phax

phax commented 1 year ago

That's how Schematron works. Schematron is converted to XSLT, and because of the way XSLT matches templates, only the first rule matching a node fires within a pattern. So you need to make sure each "context" is present only once. If you merge your Schematron like this, it should work (untested):

                "    <sch:rule context=\"title\">\n" +
                "      <sch:assert test=\"matches(., '^Legal.*')\">Title must start with 'Legal'</sch:assert>\n" +
                "      <sch:assert test=\"b\"> Bold must be there in <sch:name/> element</sch:assert> \n" +
                "    </sch:rule>\n" +

So put all the assertions you need into one rule, as long as the context matches.

pradyumanaggarwal commented 1 year ago

I am trying another sch file and getting a different type of error. I also face this issue when using abstract patterns in sch files.

ERROR com.helger.xml.transform.LoggingTransformErrorListener - [fatal_error] Transformation fatal error (net.sf.saxon.trans.XPathException: A sequence of more than one item is not allowed as the first argument of fn:string-length() ("Content is provided for demonstrati...", " ") )

java.lang.RuntimeException: net.sf.saxon.trans.XPathException: A sequence of more than one item is not allowed as the first argument of fn:string-length() ("Content is provided for demonstrati...", " ") 

    at com.adobe.fmdita.schematron.SchematronServiceTest.test2(SchematronServiceTest.java:109)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
    at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
    at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
    at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
    at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
    at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48)
    at org.junit.rules.RunRules.evaluate(RunRules.java:20)
    at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
    at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
    at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
    at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
    at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
    at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
    at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
    at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
    at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
    at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:69)
    at com.intellij.rt.junit.IdeaTestRunner$Repeater$1.execute(IdeaTestRunner.java:38)
    at com.intellij.rt.execution.junit.TestsRepeater.repeat(TestsRepeater.java:11)
    at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:35)
    at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:235)
    at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:54)
Caused by: net.sf.saxon.trans.XPathException: A sequence of more than one item is not allowed as the first argument of fn:string-length() ("Content is provided for demonstrati...", " ") 

Do I need to configure any other attribute or property in the sch file? I didn't find a proper solution for this. Code:

@Test
    public void test2 ()
    {
        final String schematron = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
                "<sch:schema xmlns:sch=\"http://purl.oclc.org/dsdl/schematron\" queryBinding=\"xslt2\">\n" +
                "  <sch:title>Schematron 1</sch:title>\n" +
                "  <sch:pattern>\n" +
                "    <sch:rule context=\"shortdesc\">\n" +
                "        <sch:let name=\"characters\" value=\"string-length(text())\"/>\n" +
                "        <sch:assert test=\"$characters &lt; 10\"> \n" +
                "        You have characters. Short Description characters should be less than 10.       \n" +
                "        </sch:assert>  \n" +
                "    </sch:rule>\n" +
                "  </sch:pattern>\n" +
                "</sch:schema>";
        final File schematronFile = new File ("schematronFile.sch");
        try
        {
            final FileWriter writer = new FileWriter (schematronFile);
            writer.write (schematron);
            writer.close ();
        }
        catch (final IOException e)
        {
            e.printStackTrace ();
        }

        final ISchematronResource aResSCHCheck1 = SchematronResourceSCH.fromFile (schematronFile);

        boolean ans1 = aResSCHCheck1.isValidSchematron ();
        System.out.println(ans1);
        Assert.assertTrue (ans1);

        final String xmlFile = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" +
                "<topic id=\"id168TG0I0RYF\">\n" +
                "  <title><b>Hi Legal done</b></title>\n" +
                "  <shortdesc>Content is provided for demonstration purposes only. <ph audience=\"Administrator\">Administrators and operators must read the manual before operating a new vehicle. </ph> <ph audience=\"EndUser\">Any user must read the manual before operating a new vehicle. </ph> <ph product=\"ProductA\">Your luxuriously appointed way to travel to the stars awaits you! </ph> <ph product=\"ProductB\">Your well appointed spaceship awaits you! </ph></shortdesc>\n" +
                "  <prolog>\n" +
                "  </prolog>\n" +
                "  <body>\n" +
                "    <p></p>\n" +
                "    <p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua</p>\n" +
                "  </body>\n" +
                "</topic>";
        final SchematronOutputType svrl;
        try {
            svrl = aResSCHCheck1.applySchematronValidationToSVRL (new StringStreamSource(xmlFile));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        assertNotNull (svrl);
        List<Object> failedAsserts = svrl.getActivePatternAndFiredRuleAndFailedAssert();
        List<AssertionClass> errors = new ArrayList<>();

        for (Object object : failedAsserts) {
            if (object instanceof FailedAssert) {
                FailedAssert failedAssert = (FailedAssert) object;
                AssertionClass assertion = new AssertionClass(failedAssert.getTest(), failedAssert.getText(), failedAssert.getLocation());
                System.out.println(failedAssert.getText());

                errors.add(assertion);
            }
        }

        for (Object object : failedAsserts) {
            if (object instanceof SuccessfulReport) {
                SuccessfulReport successfulReport = (SuccessfulReport) object;
                AssertionClass assertion = new AssertionClass(successfulReport.getTest(), successfulReport.getText(), successfulReport.getLocation());
                System.out.println(successfulReport.getText());
                errors.add(assertion);
            }
        }
        System.out.println(errors.size());

    }

@phax

phax commented 1 year ago

I assume what you want is string-length(.) and not string-length(text()) - then it works.

Please check https://stackoverflow.com/questions/34593753/testing-text-nodes-vs-string-values-in-xpath for the difference between text() and . in XPath.
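
A small sketch of that difference, using only the JDK's built-in XPath engine and a shortened, made-up stand-in for the shortdesc element (the class name and sample XML are illustrative, not from the thread). One caveat: the JDK engine implements XPath 1.0, where string-length(text()) would silently use only the first text node. Under XPath 2.0, as evaluated by Saxon above, the same expression raises the "sequence of more than one item" error.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of ph-schematron.
class TextVersusDot {
    // shortdesc has two text node children ("Intro. " and " ") around one ph element
    static final String XML = "<topic><shortdesc>Intro. <ph>Inner.</ph> </shortdesc></topic>";

    // Evaluates an XPath 1.0 expression against the sample document as a number
    static double evaluate(String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        return (Double) xpath.evaluate(expression, doc, XPathConstants.NUMBER);
    }

    public static void main(String[] args) throws Exception {
        // text() selects every text node child, so it is a sequence of two items here
        System.out.println(evaluate("count(/topic/shortdesc/text())"));  // prints 2.0
        // the element itself has one string value: all descendant text concatenated
        System.out.println(evaluate("string-length(/topic/shortdesc)")); // prints 14.0
    }
}
```

Inside a rule with context="shortdesc", the second expression corresponds to string-length(.), which is the form suggested above.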

phax commented 1 year ago

I am closing the issue now - let's continue in discussions please. This is no longer an "issue" (at least for me ;-) )