eXist-db / exist

eXist Native XML Database and Application Platform
https://exist-db.org
GNU Lesser General Public License v2.1
428 stars 179 forks source link

matches(): non-compliant to the spec and buggy #59

Open rvdb opened 11 years ago

rvdb commented 11 years ago

eXist's implementation of matches(xs:string*, xs:string) differs w.r.t. the official matches(xs:string?, xs:string) function. As the cardinality for the first string parameter suggests, eXist accepts multiple strings as first argument for matches(), while this throws an error against the standard XPath function. This can be demonstrated by running following query with Saxon and eXist (tested with eXist-2.1):

  let $data :=
    <data>
      <entry n="1">
        <val>abc</val>
      </entry>
      <entry n="2">
        <val>xyz</val>
        <val>bcd</val>
      </entry>
      <entry n="3">
        <val>123</val>
        <val>haha</val>
      </entry>
      <entry n="4">
        <val>haha</val>
        <val>123</val>
      </entry>
  </data>
  return $data//entry[matches(val,'a')]

While Saxon throws an error ("XPTY0004: A sequence of more than one item is not allowed as the first argument of matches()"), eXist lets this pass. Yet, there seems to be a bug when the first argument of matches is a sequence of strings. Above query returns following results:

  <entry n="1">
    <val>abc</val>
  </entry>
  <entry n="4">
    <val>haha</val>
    <val>123</val>
  </entry>

While both entries 3 and 4 should be returned, 3 is omitted from the results. This illustrates that if the first argument to eXist's matches() function is a sequence, only the first item is evaluated.

This bug aside, the non-compliant behaviour is documented in the eXist function documentation at http://demo.exist-db.org/exist/apps/fundocs/view.html?uri=http://www.w3.org/2005/xpath-functions&location=java:org.exist.xquery.functions.fn.FnModule#matches.2. Still, I'm curious about what motivated this discrepancy from the official spec, and the decision to keep the eXist matches() function unchanged in the http://www.w3.org/2005/xpath-functions namespace? IMO, this makes it too easy to write non-compliant XQuery code.

dizzzz commented 10 years ago

@ljo @adamretter @wolfgangmm @shabanovd I trust Ron is right.....

adamretter commented 10 years ago

Yes, I think I asked @wolfgangmm about it in the past. If I recall correctly, he was concerned that too much depended on the broken functionality

dizzzz commented 10 years ago

I can imagine......

adamretter commented 9 years ago

The issue https://github.com/eXist-db/exist/issues/581 would also have an impact here.

adamretter commented 9 years ago

It is unlikely we are going to fix this as it is a useful optimisation. However, before we close this ticket we should add to the documentation that fn:matches, fn:contains, fn:starts-with and fn:ends-with are non-standard and update the descriptions of the function signatures to mention this also.

duncdrum commented 6 years ago

ok 2013… yay for docs

joewiz commented 6 years ago

Given the time that has passed since then and our discussions of improving spec conformance, I think we should consider taking advantage of an upcoming major version to fix this.

line-o commented 5 years ago

I would vote for standard compliance (in 5.0.0). This does not look like a useful "optimisation". Either match all elements of the sequence or throw an error.

adamretter commented 5 years ago

@joewiz is there a current position on what to do with this?

joewiz commented 5 years ago

@adamretter I can summarize the discussion above as follows: in principle, we would like to comply with the spec, but we fear that too much existing code depends upon the current implementation.

I, for one, would heartily welcome a PR bringing eXist into compliance with the spec. This PR would allow us to test our apps with to see if our fears are founded or not. Perhaps the changes required to code would be minimal, as we discovered with other changes we've made in preparation for 5.x so far.

Either way, I think it would be fair to officially deprecate the non-spec compliant code, to draw the line in the sand and allow us to change this in an upcoming major release of eXist.