phax / ph-schematron

Java Schematron library that supports XSLT and native application
Apache License 2.0

Inconsistent XPath interpreter? #6

Closed R00dRallec closed 9 years ago

R00dRallec commented 9 years ago

Hi,

I'm currently struggling with the XPath interpreter. Do you use different ones? In my main project, I get this exception: javax.xml.transform.TransformerException: Could not find function: doc

In another project that I use for testing other things (yes, I know JUnit tests in the same project would be better, I need to change this), there is no exception for the function doc.

But if I use the function current() to refer to my context element, my main project is totally fine.

My testing project throws this exception: net.sf.saxon.trans.XPathException: Unknown system function curent()

I can't explain this. I'm using the same libraries in both projects. Versions: ph-schematron 2.9.1, ph-commons 5.3.0, Saxon-HE-9.6.0-4

Thanks in advance.

phax commented 9 years ago

Are you sure it is "curent" and not "current" (with double "r")? And in the other project, please check whether you also have Xalan in your classpath. Mixing Xalan and Saxon seems to be difficult. Btw. are you referring to the following "doc" function: http://www.xsltfunctions.com/xsl/fn_doc.html ? Or which one do you mean?

R00dRallec commented 9 years ago

I made a typo, but current with double "r" gives another error: net.sf.saxon.trans.XPathException: System function current#0 is not available with this host language/version

Yes, in my "main" project I have Xalan in my classpath, but not in the testing project.

Yes, that's the desired function.

phax commented 9 years ago

The problem with "current" is that it is an XSLT function, but the pure Schematron implementation handles only XPath functions. Maybe you can find a matching XPath 2 function that solves your problem?

R00dRallec commented 9 years ago

Okay, but why have I been able to use this function with your pure Schematron implementation before? This is confusing, because the function worked once.

R00dRallec commented 9 years ago

I was able to create a workaround so I don't have to use current(). But I'm still confused why the same Schematron rule is applied successfully in one project and not in the other. I removed Xalan from the project, but no change...

Classpath:
swing2swt.jar
lib/jdom-2.0.5/lib/jaxen-1.1.6.jar
lib/jdom-2.0.5/lib/xercesImpl.jar
lib/jdom-2.0.5/lib/xml-apis.jar
lib/jdom-2.0.5/jdom-2.0.5.jar
lib/apache-log4j-2.0-bin/log4j-api-2.0.jar
lib/apache-log4j-2.0-bin/log4j-core-2.0.jar
lib/commons-beanutils-1.9.2/commons-beanutils-1.9.2.jar
lib/commons-collections-3.2.1/commons-collections-3.2.1.jar
lib/commons-lang-2.6/commons-lang-2.6.jar
lib/commons-lang3-3.3.2/commons-lang3-3.3.2.jar
lib/commons-logging-1.2/commons-logging-1.2.jar
lib/commons-configuration-1.10/commons-configuration-1.10.jar
lib/slf4j-1.7.7/slf4j-simple-1.7.7.jar
lib/slf4j-1.7.7/slf4j-api-1.7.7.jar
lib/commons-codec-1.9/commons-codec-1.9.jar
lib/commons-io-2.4/commons-io-2.4.jar
lib/pdfbox/fontbox-1.8.6.jar
lib/pdfbox/pdfbox-1.8.6.jar
lib/pdfbox/preflight-1.8.6.jar
lib/pdfbox/xmpbox-1.8.6.jar
lib/apache-log4j-2.0-bin/log4j-1.2-api-2.0.jar
lib/schematron/ph-commons-5.3.0.jar
lib/schematron/ph-schematron-2.9.0.jar
lib/saxon/Saxon-HE-9.6.0-4.jar
lib/apache-fop/avalon-framework-4.2.0.jar
lib/apache-fop/batik-all-1.7.jar
lib/apache-fop/fop.jar
lib/apache-fop/serializer-2.7.0.jar
lib/apache-fop/xercesImpl-2.7.1.jar
lib/apache-fop/xml-apis-1.3.04.jar
lib/apache-fop/xml-apis-ext-1.3.04.jar
lib/apache-fop/xmlgraphics-commons-1.5.jar

phax commented 9 years ago

Weird - I tried to build a minimal example and got the error that "current" is an unknown function. According to http://stackoverflow.com/questions/3657745/xpath-query-how-to-refer-to-the-current-node-java-saxon this is clearly not XPath, and they suggest a workaround.

What I stumbled upon is that when updating from Saxon 9.5 to 9.6, it was necessary to prefix all XPath functions with "fn:" for them to work.

And btw. your project still uses ph-schematron 2.9.0 ;-)

P.S. concerning the German page - no problem, I'm a German native speaker ;-)

R00dRallec commented 9 years ago

Yes, it's not XPath, my fault. Updating my project to 2.9.1 magically resolved the issue, and Saxon is now used to compile the XPath expression. I rewrote my rule so I don't need the current element any longer. But it's still weird.

Looking at the stack trace showed that, before my update, com.sun.org.apache.xpath.internal.compiler.XPathParser was used to compile the XPath.

phax commented 9 years ago

There was a classloader issue in 2.9.0: it used the system classloader instead of the context classloader. That's why the update worked :)
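
For anyone hitting a similar mix-up, a small probe can help show which XPath engine is being picked up and which functions it accepts. The following is only a sketch using the standard JAXP API: the class name XPathEngineProbe and the sample expressions are made up for illustration, and it reports the JAXP default lookup only, which is not necessarily the engine ph-schematron selects internally.

```java
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

// Hypothetical helper, not part of ph-schematron.
public class XPathEngineProbe {

    /** Class name of the XPathFactory that the default JAXP lookup resolves to. */
    static String activeFactory() {
        return XPathFactory.newInstance().getClass().getName();
    }

    /** True if the default engine accepts the expression at compile time. */
    static boolean compiles(String expression) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        try {
            xpath.compile(expression);
            return true;
        } catch (XPathExpressionException e) {
            // Unknown functions are typically reported here.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("XPathFactory: " + activeFactory());
        // current() is an XSLT function and doc() is XPath 2.0, so the
        // answers differ between engines; no particular result is assumed.
        String[] samples = {"current()", "doc('other.xml')", "count(//file)"};
        for (String expr : samples) {
            System.out.println(expr + " compiles: " + compiles(expr));
        }
    }
}
```

Running this in both projects and comparing the output should make it visible when one of them falls back to a different engine than the other.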

R00dRallec commented 9 years ago

Ah nice, but do you have any idea how I could implement the XSLT function current()? I need this because the context element has the same attribute as the elements checked in the assertion. Setting a variable like this: <sch:let name="current" value="."/> and accessing it like this: test="//Test/@Id = $current/@Id" did not work as expected.

phax commented 9 years ago

Can you do a nested query as suggested in the SO entry, like /import/record[id/text() = ./parent/text()]/data ?

R00dRallec commented 9 years ago

No, it does not work. It's interpreted as if I were trying to check the node in the assertion and not an attribute of the context itself.

phax commented 9 years ago

What if you move the attribute from the context to the assert itself? So instead of context /element/@attr you could write: context /element and assertion (@attr and condition) or (not @attr)

R00dRallec commented 9 years ago

Not possible due to nested conditions. According to this page: http://www.schematron.com/iso/P8.html the let element is calculated in the current context: "A declaration of a named variable. If the let element is the child of a rule element, the variable is calculated and scoped to the current rule and context. Otherwise, the variable is calculated with the context of the instance document root. The required name attribute is the name of the variable. The required value attribute is an expression evaluated in the current context."

So I created this example: http://pastebin.com/1wraX0P8

http://pastebin.com/ktChYAvi

The assertion fails 3 times. Am I too dumb or is this a bug? Thanks again for your effort!

phax commented 9 years ago

You are right - this is a bug. Currently all lets seem to be evaluated in global scope only. More information to come.

The scoping of the variables is correct. But "let" is just a text replacement.

Concerning the example you posted, the following worked for me:

<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:pattern name="Every file is referenced to one doc">
    <sch:rule context="//reference">
      <sch:assert test="count(@To_File = //file/@Id) = 1">
        file <sch:value-of select="@To_File" /> is not referenced once but <sch:value-of select="count(@To_File = //file/@Id)" />
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>

So it also works the other way around:

<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:pattern name="Every file is referenced to one doc">
    <sch:rule context="//file">
      <sch:assert test="count(@Id = //reference/@To_File) = 1">
        file <sch:value-of select="@Id" /> is not referenced once but <sch:value-of select="count(@Id = //reference/@To_File)" />
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>

So reference the attribute of the current element on the left-hand side :) And btw. try to avoid "//" for performance reasons. Can we close this issue?

R00dRallec commented 9 years ago

I can't get your solution running. I can change the attributes in the XML however I want, the assertion never fails. This cannot be correct :o

And according to the Schematron specification, "let" is not a text replacement but a calculated variable: "If the let element is the child of a rule element, the variable is calculated and scoped to the current rule and context." Or do I misunderstand the sentence?

phax commented 9 years ago

In the current implementation, the "let" handling works as follows: collect all variables in the correct scope, perform all text replacements in the current XPath, then evaluate the XPath. That's what I mean by "just a text replacement". The tricky part is just getting the scope right.

Concerning the example: there was a typo in the second example - but I do get your point.
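
To make the difference concrete, here is a minimal sketch with plain JAXP XPath. It is not ph-schematron's code: the class LetScopeDemo, the sample document and the variable name cur are assumptions chosen for illustration. It evaluates the same test once with "$cur" textually replaced by "." and once with "$cur" bound to the rule's context node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of ph-schematron.
public class LetScopeDemo {
    // Made-up instance document: two references point at file "a", none at "b".
    static final String XML =
        "<root><file Id='a'/><file Id='b'/>"
        + "<reference To_File='a'/><reference To_File='a'/></root>";

    // The kind of test one would write after <sch:let name="cur" value="."/>.
    static final String TEST = "count(//reference[@To_File = $cur/@Id])";

    /** Evaluates TEST with file "a" as context, by text replacement or by binding. */
    static double evaluate(boolean textReplacement) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Node context = (Node) xpath.evaluate(
                "/root/file[@Id='a']", doc, XPathConstants.NODE);
            if (textReplacement) {
                // "$cur" becomes ".", which inside the predicate refers to
                // the reference element being filtered, not to the file.
                String replaced = TEST.replace("$cur", ".");
                return (Double) xpath.evaluate(replaced, context, XPathConstants.NUMBER);
            }
            // "$cur" is resolved to the node that was the context at declaration.
            xpath.setXPathVariableResolver(
                name -> "cur".equals(name.getLocalPart()) ? context : null);
            return (Double) xpath.evaluate(TEST, context, XPathConstants.NUMBER);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("text replacement: " + evaluate(true));
        System.out.println("bound variable:   " + evaluate(false));
    }
}
```

With the JDK's built-in XPath engine, the text-replacement variant should count 0 references and the bound variant 2, because "." inside the predicate is re-interpreted relative to each reference element.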

phax commented 9 years ago

The following check at least determines files that are not referenced:

<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
  <sch:pattern name="Every file is referenced to one doc">
    <sch:rule context="//file">
      <sch:assert test="count(.[@Id = //reference/@To_File]) = 1">
        file <sch:value-of select="@Id" /> is not referenced once but <sch:value-of select="count(.[@Id = //reference/@To_File])" /> times.
      </sch:assert>
    </sch:rule>
  </sch:pattern>
</sch:schema>

or even simpler:

    <sch:rule context="//file">
      <sch:assert test="@Id = //reference/@To_File">
        file <sch:value-of select="@Id" /> is not referenced at all
      </sch:assert>
    </sch:rule>

phax commented 9 years ago

I spent some time on this but couldn't find a suitable solution for the pure version. When using the XSLT-based version instead, I found a solution:

    <sch:rule context="//doc">
      <sch:let name="cur" value="." />
      <sch:assert test="count(//reference[@To_File = $cur/generic/reference/@To_File]) = 1">
        file <sch:value-of select="generic/reference/@To_File" /> referenced from document <sch:value-of select="@Id" /> is referenced <sch:value-of select="count(//reference[@To_File = $cur/generic/reference/@To_File])" /> times overall
      </sch:assert>
    </sch:rule>

So the "let" handling differs between Pure and XSLT. XSLT seems to really evaluate the variable in the current context and pass the evaluation result into the test expression. If this is not an option, please file a separate issue.
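
As a rough illustration of why the XSLT route behaves differently, the sketch below runs a tiny stylesheet through the standard JAXP TransformerFactory. It does not use ph-schematron at all: the class XsltCurrentDemo, the sample document and the stylesheet are made up for this example. Inside a template, current() keeps pointing at the node the template was matched on, even within a predicate on another element.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical demo class, not part of ph-schematron.
public class XsltCurrentDemo {
    // Made-up instance document: two references point at file "a", none at "b".
    static final String XML =
        "<root><file Id='a'/><file Id='b'/>"
        + "<reference To_File='a'/><reference To_File='a'/></root>";

    // For each file, emit "<Id>=<number of references pointing at it>;".
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='text'/>"
        + "<xsl:template match='file'>"
        + "<xsl:value-of select='@Id'/>="
        + "<xsl:value-of select='count(//reference[@To_File = current()/@Id])'/>;"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Applies the stylesheet to the sample document and returns the text output. */
    static String transform() {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(XML)), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform());
    }
}
```

For this input the output should be a=2;b=0; which is the per-file reference count that the text-replacement approach cannot express.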

R00dRallec commented 9 years ago

Wow, using the AbstractSchematronResource as you did in your test case allows me to use the XSLT function current(). So there is no longer a need for the variable. But can I add my own MapBasedXPathFunctionResolver to an AbstractSchematronResource?

phax commented 9 years ago

As far as I know this is only possible with XPath. You may use extensive "let"s :)

R00dRallec commented 9 years ago

Thank you very much! This issue is closed now.