phax / ph-schematron

Java Schematron library that supports XSLT and native application
Apache License 2.0

Lose curly brackets in matches function #137

Closed floyd8787 closed 1 year ago

floyd8787 commented 1 year ago

I use the matches function in my mysch.sch file:

<schema xmlns="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">
<ns prefix="xsi" uri="http://www.w3.org/2001/XMLSchema-instance"/>
    <pattern>
        <rule context="tag1">
            <assert test="matches(tag2,'^[0-9]{4}$')">Invalid value</assert>
        </rule>
    </pattern>
</schema>

And I have a test.xml file:

<?xml version="1.0" encoding="utf-8"?>
<tag1>
    <tag2>33 99</tag2>
</tag1>

I wrote a test:

    @Test
    public void testSh() throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder db = dbf.newDocumentBuilder();
        Document document = db.parse(ShTest.class.getResourceAsStream("/test.xml"));

        IReadableResource schematronResource = new ReadableResourceByteArray(IOUtils.toByteArray(ShTest.class.getResourceAsStream("/mysch.sch")));
        SchematronResourceSCH schematron = new SchematronResourceSCH(schematronResource);
        schematron.setUseCache(false);
        SchematronOutputType schematronOutputType = schematron.applySchematronValidationToSVRL(document, null);
        IErrorList validationErrors = SchematronHelper.convertToErrorList(schematronOutputType, null);
        SVRLResourceError svrlResourceError = (SVRLResourceError) validationErrors.get(0);

        Assert.assertEquals("matches(tag2,'^[0-9]{4}$')", svrlResourceError.getTest());
    }

But it fails, because svrlResourceError.getTest() returns matches(tag2,'^[0-9]4$'): the curly brackets are lost. Why, and how can I fix it? My ph-schematron version is 5.2.0. Thanks.

phax commented 1 year ago

I tested it with the latest Saxon HE version 11.4 and it works without a problem. The above test outputs:

<?xml version="1.0" encoding="UTF-8"?>
<svrl:schematron-output xmlns:svrl="http://purl.oclc.org/dsdl/svrl" title="" schemaVersion="">
  <svrl:ns-prefix-in-attribute-values prefix="xsi" uri="http://www.w3.org/2001/XMLSchema-instance" />
  <svrl:active-pattern document="[edited]\ph-schematron\ph-schematron-xslt\src\test\resources\issues\github137\test.xml" />
  <svrl:fired-rule context="tag1" />
  <svrl:failed-assert location="/tag1[1]" test="matches(tag2,'^[0-9]4$')">
    <svrl:text>Invalid value</svrl:text>
  </svrl:failed-assert>
</svrl:schematron-output>

My assumption is that you are running an old Saxon version, Xalan, or some other non-compliant XSLT processor. Please check your Maven dependencies.

floyd8787 commented 1 year ago

Look at the test attribute value in <svrl:failed-assert location="/tag1[1]" test="matches(tag2,'^[0-9]4$')">: the regex has lost its curly brackets.

phax commented 1 year ago

Ah, you mean a missing curly brace :)

But this regular expression is also valid: '^[0-9]4$'. It matches the following values: 04, 14, 24, ... 94
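To make the difference between the two patterns concrete, here is a minimal sketch in plain Java. It assumes that java.util.regex behaves like the XPath matches() function for these two simple patterns (both search for a match rather than requiring a full-string match, hence find()). The class and method names are hypothetical and not part of ph-schematron.

```java
import java.util.regex.Pattern;

public class BraceRegexDemo {

    // Pattern as reported in the SVRL output: one digit followed by a literal '4'
    static final Pattern WITHOUT_BRACES = Pattern.compile("^[0-9]4$");

    // Pattern as written in mysch.sch: exactly four digits
    static final Pattern WITH_BRACES = Pattern.compile("^[0-9]{4}$");

    static boolean matchesWithoutBraces(String value) {
        return WITHOUT_BRACES.matcher(value).find();
    }

    static boolean matchesWithBraces(String value) {
        return WITH_BRACES.matcher(value).find();
    }

    public static void main(String[] args) {
        // Sample values taken from the thread
        for (String value : new String[] {"04", "94", "2222", "4509", "33 99"}) {
            System.out.println(value
                + " | without braces: " + matchesWithoutBraces(value)
                + " | with braces: " + matchesWithBraces(value));
        }
    }
}
```

The two patterns accept disjoint sets of values, so the text reported by getTest() describes a different rule than the one the schema actually contains.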

floyd8787 commented 1 year ago

Yes, curly braces :) I expect values that match ^[0-9]{4}$, for example 2222 or 4509, i.e. the curly braces are required. I use curly braces in mysch.sch, but svrlResourceError.getTest() returns the expression without them.

phax commented 1 year ago

Ahhhhhhh........ Looking at it

phax commented 1 year ago

With the pure version the result is correct, so the problem is somewhere in the XSLT transformation.

phax commented 1 year ago

With SchXslt it also works without an issue

floyd8787 commented 1 year ago

Yes, validation works without an issue. I am talking about the method SVRLResourceError#getTest(). The result of this method does not equal the original expression.

phax commented 1 year ago

This is some XSLT magic I don't understand.

It seems like we're talking about "value templates" here, according to https://www.w3.org/TR/xslt-30/#value-templates and there it is forbidden to use single curly braces. I would need some intel from an XSLT expert here.

Anyway, by using the "SchXslt" version, you should be getting the correct result, because SchXslt is not using XSLT to provide the value but passes the value natively in the XSLT.

ISO Schematron stub created XSLT:

<xsl:template name="process-assert">
    <xsl:param name="test"/>
...
    <svrl:failed-assert test="{$test}" >
...
</xsl:template>

SchXslt created XSLT:

  <template match="tag1" mode="w13aab3" priority="0">
...
          <if test="not(matches(tag2,'^[0-9]{4}$'))">
            <ns1:failed-assert xmlns:ns1="http://purl.oclc.org/dsdl/svrl" location="{schxslt:location(.)}">
              <attribute name="test">matches(tag2,'^[0-9]{4}$')</attribute>
              <ns1:text>Invalid value</ns1:text>
            </ns1:failed-assert>
          </if>
...
  </template>
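The following is a minimal sketch of how an attribute value template treats braces. It uses only the XSLT processor bundled with the JDK, not ph-schematron. It rests on an unconfirmed assumption: that the test text ends up as an attribute of a literal result element in a generated stylesheet, where {4} is evaluated as the XPath number 4. The class name, the method names, and the idea of doubling the braces are hypothetical illustrations, not something the library is known to do.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class AvtBraceDemo {

    /** Doubles curly braces so that an attribute value template emits them literally. */
    static String escapeForAvt(String value) {
        return value.replace("{", "{{").replace("}", "}}");
    }

    /**
     * Runs a tiny stylesheet whose literal result element carries the given
     * text in its test attribute, and returns the serialized output.
     * The text is concatenated unescaped, so it must not contain '"', '<' or '&'.
     */
    static String transformWithTestAttribute(String testAttribute) {
        String xslt =
            "<xsl:stylesheet version=\"1.0\" xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
          + "<xsl:output method=\"xml\" omit-xml-declaration=\"yes\"/>"
          + "<xsl:template match=\"/\">"
          + "<out test=\"" + testAttribute + "\"/>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader("<tag1/>")),
                                  new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            // Wrapped so that callers do not need to handle checked exceptions
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String original = "matches(tag2,'^[0-9]{4}$')";
        // Unescaped: {4} is treated as an expression inside the attribute value template
        System.out.println(transformWithTestAttribute(original));
        // Escaped: {{4}} is emitted as the literal text {4}
        System.out.println(transformWithTestAttribute(escapeForAvt(original)));
    }
}
```

In this sketch the first call should lose the braces in the same way as the SVRL output above, while the second should keep them. Whether the ISO Schematron stub could apply the same doubling when it emits the test attribute is an open question.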

Sorry :(

phax commented 1 year ago

If anyone has a smart proposal on how to resolve this, I'd be happy, I am out of ideas. Closing this for now.