eXist-db / exist

eXist Native XML Database and Application Platform
https://exist-db.org
GNU Lesser General Public License v2.1

[BUG] fn:contains mishandles non-string types, allows invalid data types #4035

Open joewiz opened 3 years ago

joewiz commented 3 years ago

Describe the bug

eXist's implementation of numerous functions that perform operations on string values, such as fn:contains(), fn:starts-with(), fn:ends-with(), fn:matches(), fn:replace(), and fn:tokenize(), is mishandling inputs, allowing non-string types instead of raising errors about the invalid data types.

Taking fn:contains() as an example, the XQuery spec states that this function takes two parameters, both of type xs:string? (see https://www.w3.org/TR/xpath-functions-31/#func-contains). In the 1st example below, the rules of type promotion do not allow promoting numbers like 1 to strings like "1". And given the function's signature, there is no reason for eXist to try to cast any of the parameters to xs:double, as it does in the 2nd example.
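For contrast, here is a minimal sketch of calls that the function conversion rules of XQuery 3.1 do accept for an xs:string? parameter. The values are illustrative only, and the results noted in the comments are what the spec prescribes, not output captured from eXist.

```xquery
xquery version "3.1";

(: A node argument is atomized; the resulting xs:untypedAtomic value
   is cast to the expected type xs:string?. Per the spec: true :)
contains(<foo>bar</foo>, "ba"),

(: xs:anyURI is promoted to xs:string. Per the spec: true :)
contains(xs:anyURI("http://exist-db.org"), "exist"),

(: An explicit conversion makes a numeric value acceptable. Per the spec: true :)
contains("foo1", string(1))

(: By contrast, contains("foo", 1) has no applicable conversion from
   xs:integer to xs:string, so a conforming processor raises XPTY0004. :)
```

Only untyped values and xs:anyURI are converted implicitly; every other atomic type has to be converted by the caller.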

Expected behavior

contains("foo", 1) should raise an XPTY0004 type error that the 2nd argument must be of type xs:string (and does so in Saxon and BaseX), but eXist returns false().

contains(<foo>bar</foo>, 1) should raise the same type error about the 2nd argument not being of type xs:string (and does so in Saxon and BaseX), but eXist returns an error about the 1st argument not being castable to xs:double.

Same goes for the other standard functions that take strings, like fn:starts-with(), fn:ends-with(), fn:matches(), fn:replace(), and fn:tokenize().

To Reproduce

The following XQSuite test module contains variations of the above examples for fn:contains() and shows the failure described above. It shows the same failure for fn:starts-with(), fn:ends-with(), fn:matches(), fn:replace(), and fn:tokenize().

Every other standard function from the fn: namespace that takes a parameter of type xs:string correctly raises the XPTY0004 error.

xquery version "3.1";

module namespace t="http://exist-db.org/xquery/test";

declare namespace test="http://exist-db.org/xquery/xqsuite";

declare
    %test:assertError("XPTY0004")
function t:contains1() {
    contains("foo", 1)
};

declare
    %test:assertError("XPTY0004")
function t:contains2() {
    contains(1, "foo")
};

declare
    %test:assertError("XPTY0004")
function t:contains3() {
    contains(<foo>bar</foo>, 1)
};

declare
    %test:assertError("XPTY0004")
function t:contains4() {
    contains(1, <foo>bar</foo>)
};

declare
    %test:assertError("XPTY0004")
function t:contains-token1() {
    contains-token("foo", 1)
};

declare
    %test:assertError("XPTY0004")
function t:contains-token2() {
    contains-token(1, "foo")
};

declare
    %test:assertError("XPTY0004")
function t:contains-token3() {
    contains-token(<foo>bar</foo>, 1)
};

declare
    %test:assertError("XPTY0004")
function t:contains-token4() {
    contains-token(1, <foo>bar</foo>)
};

declare
    %test:assertError("XPTY0004")
function t:starts-with1() {
    starts-with("foo", 1)
};

declare
    %test:assertError("XPTY0004")
function t:starts-with2() {
    starts-with(1, "foo")
};

declare
    %test:assertError("XPTY0004")
function t:starts-with3() {
    starts-with(<foo>bar</foo>, 1)
};

declare
    %test:assertError("XPTY0004")
function t:starts-with4() {
    starts-with(1, <foo>bar</foo>)
};

declare
    %test:assertError("XPTY0004")
function t:ends-with1() {
    ends-with("foo", 1)
};

declare
    %test:assertError("XPTY0004")
function t:ends-with2() {
    ends-with(1, "foo")
};

declare
    %test:assertError("XPTY0004")
function t:ends-with3() {
    ends-with(<foo>bar</foo>, 1)
};

declare
    %test:assertError("XPTY0004")
function t:ends-with4() {
    ends-with(1, <foo>bar</foo>)
};

declare
    %test:assertError("XPTY0004")
function t:matches1() {
    matches("foo", 1)
};

declare
    %test:assertError("XPTY0004")
function t:matches2() {
    matches(1, "foo")
};

declare
    %test:assertError("XPTY0004")
function t:matches3() {
    matches(<foo>bar</foo>, 1)
};

declare
    %test:assertError("XPTY0004")
function t:matches4() {
    matches(1, <foo>bar</foo>)
};

declare
    %test:assertError("XPTY0004")
function t:replace1() {
    replace("foo", 1, 2)
};

declare
    %test:assertError("XPTY0004")
function t:replace2() {
    replace(1, "foo", 2)
};

declare
    %test:assertError("XPTY0004")
function t:replace3() {
    replace(<foo>bar</foo>, 1, 2)
};

declare
    %test:assertError("XPTY0004")
function t:replace4() {
    replace(1, <foo>bar</foo>, 2)
};

declare
    %test:assertError("XPTY0004")
function t:tokenize1() {
    tokenize("foo", 1)
};

declare
    %test:assertError("XPTY0004")
function t:tokenize2() {
    tokenize(1, "foo")
};

declare
    %test:assertError("XPTY0004")
function t:tokenize3() {
    tokenize(<foo>bar</foo>, 1)
};

declare
    %test:assertError("XPTY0004")
function t:tokenize4() {
    tokenize(1, <foo>bar</foo>)
};

declare
    %test:assertError("XPTY0004")
function t:analyze-string1() {
    analyze-string("foo", 1)
};

declare
    %test:assertError("XPTY0004")
function t:analyze-string2() {
    analyze-string(1, "foo")
};

declare
    %test:assertError("XPTY0004")
function t:analyze-string3() {
    analyze-string(<foo>bar</foo>, 1)
};

declare
    %test:assertError("XPTY0004")
function t:analyze-string4() {
    analyze-string(1, <foo>bar</foo>)
};

declare
    %test:assertError("XPTY0004")
function t:string-to-codepoints1() {
    string-to-codepoints(1)
};

declare
    %test:assertError("XPTY0004")
function t:string-join1() {
    string-join("foo", 1)
};

declare
    %test:assertError("XPTY0004")
function t:string-length1() {
    string-length(1)
};

declare
    %test:assertError("XPTY0004")
function t:normalize-space1() {
    normalize-space(1)
};

declare
    %test:assertError("XPTY0004")
function t:normalize-unicode1() {
    normalize-unicode(1)
};

declare
    %test:assertError("XPTY0004")
function t:upper-case1() {
    upper-case(1)
};

declare
    %test:assertError("XPTY0004")
function t:lower-case1() {
    lower-case(1)
};

declare
    %test:assertError("XPTY0004")
function t:translate1() {
    translate(1, 2, 3)
};

declare
    %test:assertError("XPTY0004")
function t:substring1() {
    substring(1, 2, 3)
};

declare
    %test:assertError("XPTY0004")
function t:substring-before1() {
    substring-before("foo", 1)
};

declare
    %test:assertError("XPTY0004")
function t:substring-before2() {
    substring-before(1, "foo")
};

declare
    %test:assertError("XPTY0004")
function t:substring-before3() {
    substring-before(<foo>bar</foo>, 1)
};

declare
    %test:assertError("XPTY0004")
function t:substring-before4() {
    substring-before(1, <foo>bar</foo>)
};

declare
    %test:assertError("XPTY0004")
function t:substring-after1() {
    substring-after("foo", 1)
};

declare
    %test:assertError("XPTY0004")
function t:substring-after2() {
    substring-after(1, "foo")
};

declare
    %test:assertError("XPTY0004")
function t:substring-after3() {
    substring-after(<foo>bar</foo>, 1)
};

declare
    %test:assertError("XPTY0004")
function t:substring-after4() {
    substring-after(1, <foo>bar</foo>)
};

declare
    %test:assertError("XPTY0004")
function t:error1() {
    error(QName("foo", "bar"), 1)
};

declare
    %test:assertError("XPTY0004")
function t:trace1() {
    trace("foo", 1)
};

declare
    %test:assertError("XPTY0004")
function t:format-number1() {
    format-number(1, 2)
};

declare
    %test:assertError("XPTY0004")
function t:format-date1() {
    format-date(current-date(), 1)
};

declare
    %test:assertError("XPTY0004")
function t:format-dateTime1() {
    format-dateTime(current-dateTime(), 1)
};

declare
    %test:assertError("XPTY0004")
function t:format-time1() {
    format-time(current-time(), 1)
};

declare
    %test:assertError("XPTY0004")
function t:compare1() {
    compare(1, "bar")
};

declare
    %test:assertError("XPTY0004")
function t:compare2() {
    compare("bar", 1)
};

declare
    %test:assertError("XPTY0004")
function t:codepoint-equal1() {
    codepoint-equal(1, "bar")
};

declare
    %test:assertError("XPTY0004")
function t:codepoint-equal2() {
    codepoint-equal("bar", 1)
};

declare
    %test:assertError("XPTY0004")
function t:parse-ietf-date1() {
    parse-ietf-date(1)
};

declare
    %test:assertError("XPTY0004")
function t:doc1() {
    doc(1)
};

declare
    %test:assertError("XPTY0004")
function t:doc-available1() {
    doc-available(1)
};

declare
    %test:assertError("XPTY0004")
function t:unparsed-text1() {
    unparsed-text(1)
};

declare
    %test:assertError("XPTY0004")
function t:unparsed-text-lines1() {
    unparsed-text-lines(1)
};

declare
    %test:assertError("XPTY0004")
function t:unparsed-text-available1() {
    unparsed-text-available(1)
};

declare
    %test:assertError("XPTY0004")
function t:environment-variable1() {
    environment-variable(1)
};

declare
    %test:assertError("XPTY0004")
function t:parse-xml1() {
    parse-xml(1)
};

declare
    %test:assertError("XPTY0004")
function t:parse-xml-fragment1() {
    parse-xml-fragment(1)
};

declare
    %test:assertError("XPTY0004")
function t:parse-json1() {
    parse-json(1)
};

declare
    %test:assertError("XPTY0004")
function t:json-doc1() {
    json-doc(1)
};

declare
    %test:assertError("XPTY0004")
function t:json-to-xml1() {
    json-to-xml(1)
};

declare
    %test:assertError("XPTY0004")
function t:load-xquery-module1() {
    load-xquery-module(1)
};

This test returns the following results:

<testsuite package="http://exist-db.org/xquery/test" timestamp="2021-09-12T17:12:56.255-04:00"
    tests="72" failures="24" errors="0" pending="0" time="PT0.033S">
    <testcase name="analyze-string1" class="t:analyze-string1"/>
    <testcase name="analyze-string2" class="t:analyze-string2"/>
    <testcase name="analyze-string3" class="t:analyze-string3"/>
    <testcase name="analyze-string4" class="t:analyze-string4"/>
    <testcase name="codepoint-equal1" class="t:codepoint-equal1"/>
    <testcase name="codepoint-equal2" class="t:codepoint-equal2"/>
    <testcase name="compare1" class="t:compare1"/>
    <testcase name="compare2" class="t:compare2"/>
    <testcase name="contains-token1" class="t:contains-token1"/>
    <testcase name="contains-token2" class="t:contains-token2"/>
    <testcase name="contains-token3" class="t:contains-token3"/>
    <testcase name="contains-token4" class="t:contains-token4"/>
    <testcase name="contains1" class="t:contains1">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>false</output>
    </testcase>
    <testcase name="contains2" class="t:contains2">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>false</output>
    </testcase>
    <testcase name="contains3" class="t:contains3">
        <failure
            message="Expected error XPTY0004, got: err:FORG0001 Invalid value for cast/constructor. cannot construct xs:double from 'bar'"
            type="failure-error-code-1"/>
    </testcase>
    <testcase name="contains4" class="t:contains4">
        <failure
            message="Expected error XPTY0004, got: err:FORG0001 Invalid value for cast/constructor. cannot construct xs:double from 'bar'"
            type="failure-error-code-1"/>
    </testcase>
    <testcase name="doc-available1" class="t:doc-available1"/>
    <testcase name="doc1" class="t:doc1"/>
    <testcase name="ends-with1" class="t:ends-with1">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>false</output>
    </testcase>
    <testcase name="ends-with2" class="t:ends-with2">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>false</output>
    </testcase>
    <testcase name="ends-with3" class="t:ends-with3">
        <failure
            message="Expected error XPTY0004, got: err:FORG0001 Invalid value for cast/constructor. cannot construct xs:double from 'bar'"
            type="failure-error-code-1"/>
    </testcase>
    <testcase name="ends-with4" class="t:ends-with4">
        <failure
            message="Expected error XPTY0004, got: err:FORG0001 Invalid value for cast/constructor. cannot construct xs:double from 'bar'"
            type="failure-error-code-1"/>
    </testcase>
    <testcase name="environment-variable1" class="t:environment-variable1"/>
    <testcase name="error1" class="t:error1"/>
    <testcase name="format-date1" class="t:format-date1"/>
    <testcase name="format-dateTime1" class="t:format-dateTime1"/>
    <testcase name="format-number1" class="t:format-number1"/>
    <testcase name="format-time1" class="t:format-time1"/>
    <testcase name="json-doc1" class="t:json-doc1"/>
    <testcase name="json-to-xml1" class="t:json-to-xml1"/>
    <testcase name="load-xquery-module1" class="t:load-xquery-module1"/>
    <testcase name="lower-case1" class="t:lower-case1"/>
    <testcase name="matches1" class="t:matches1">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>false</output>
    </testcase>
    <testcase name="matches2" class="t:matches2">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>false</output>
    </testcase>
    <testcase name="matches3" class="t:matches3">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>false</output>
    </testcase>
    <testcase name="matches4" class="t:matches4">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>false</output>
    </testcase>
    <testcase name="normalize-space1" class="t:normalize-space1"/>
    <testcase name="normalize-unicode1" class="t:normalize-unicode1"/>
    <testcase name="parse-ietf-date1" class="t:parse-ietf-date1"/>
    <testcase name="parse-json1" class="t:parse-json1"/>
    <testcase name="parse-xml-fragment1" class="t:parse-xml-fragment1"/>
    <testcase name="parse-xml1" class="t:parse-xml1"/>
    <testcase name="replace1" class="t:replace1">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>foo</output>
    </testcase>
    <testcase name="replace2" class="t:replace2">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>1</output>
    </testcase>
    <testcase name="replace3" class="t:replace3">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>bar</output>
    </testcase>
    <testcase name="replace4" class="t:replace4">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>1</output>
    </testcase>
    <testcase name="starts-with1" class="t:starts-with1">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>false</output>
    </testcase>
    <testcase name="starts-with2" class="t:starts-with2">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>false</output>
    </testcase>
    <testcase name="starts-with3" class="t:starts-with3">
        <failure
            message="Expected error XPTY0004, got: err:FORG0001 Invalid value for cast/constructor. cannot construct xs:double from 'bar'"
            type="failure-error-code-1"/>
    </testcase>
    <testcase name="starts-with4" class="t:starts-with4">
        <failure
            message="Expected error XPTY0004, got: err:FORG0001 Invalid value for cast/constructor. cannot construct xs:double from 'bar'"
            type="failure-error-code-1"/>
    </testcase>
    <testcase name="string-join1" class="t:string-join1"/>
    <testcase name="string-length1" class="t:string-length1"/>
    <testcase name="string-to-codepoints1" class="t:string-to-codepoints1"/>
    <testcase name="substring-after1" class="t:substring-after1"/>
    <testcase name="substring-after2" class="t:substring-after2"/>
    <testcase name="substring-after3" class="t:substring-after3"/>
    <testcase name="substring-after4" class="t:substring-after4"/>
    <testcase name="substring-before1" class="t:substring-before1"/>
    <testcase name="substring-before2" class="t:substring-before2"/>
    <testcase name="substring-before3" class="t:substring-before3"/>
    <testcase name="substring-before4" class="t:substring-before4"/>
    <testcase name="substring1" class="t:substring1"/>
    <testcase name="tokenize1" class="t:tokenize1">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>foo</output>
    </testcase>
    <testcase name="tokenize2" class="t:tokenize2">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>1</output>
    </testcase>
    <testcase name="tokenize3" class="t:tokenize3">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>bar</output>
    </testcase>
    <testcase name="tokenize4" class="t:tokenize4">
        <failure message="Expected error XPTY0004." type="failure-error-code-1"/>
        <output>1</output>
    </testcase>
    <testcase name="trace1" class="t:trace1"/>
    <testcase name="translate1" class="t:translate1"/>
    <testcase name="unparsed-text-available1" class="t:unparsed-text-available1"/>
    <testcase name="unparsed-text-lines1" class="t:unparsed-text-lines1"/>
    <testcase name="unparsed-text1" class="t:unparsed-text1"/>
    <testcase name="upper-case1" class="t:upper-case1"/>
</testsuite>

joewiz commented 3 years ago

I'm associating this issue with eXist 6 as the fix would likely break code depending on the current buggy behavior.
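Code that relies on the lenient behavior can be made independent of it by converting non-string values explicitly. A minimal sketch, assuming hypothetical variables $code and $label that do not come from this report:

```xquery
xquery version "3.1";

(: Hypothetical data: $code is an xs:integer, $label is an xs:string. :)
let $code := 42
let $label := "item-42"
return
    (: contains($label, $code) depends on the behavior described in this
       issue. Converting explicitly with string() is valid on any
       conforming processor. Per the spec: true :)
    contains($label, string($code))
```

Written this way, the call does not depend on whether the processor enforces the xs:string? signature.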

joewiz commented 3 years ago

I've extended the report to include fn:starts-with() and fn:ends-with() - which suffer from the same bug in eXist.

joewiz commented 3 years ago

I've extended the report to include fn:matches(), fn:replace() and fn:analyze-string().

joewiz commented 3 years ago

I've extended the report to include every function in the fn: namespace that takes an xs:string parameter, and updated the XQSuite results accordingly. After completing this more exhaustive test, the functions with the problem described above are:

  1. fn:contains()
  2. fn:starts-with()
  3. fn:ends-with()
  4. fn:matches()
  5. fn:replace()
  6. fn:tokenize()

line-o commented 2 years ago

Whoa! What a find @joewiz

line-o commented 2 years ago

Fixing this has the potential to break some applications and even libraries written for eXist-db.
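One way to guard against regressions would be a companion XQSuite module with calls that are valid under the spec and should keep working after a fix. The sketch below is only an assumption about what such a module could look like. The module namespace, prefix, and function names are hypothetical.

```xquery
xquery version "3.1";

(: Hypothetical module namespace, chosen for this sketch only. :)
module namespace tp="http://exist-db.org/xquery/test/string-args-valid";

declare namespace test="http://exist-db.org/xquery/xqsuite";

(: A node argument is atomized and cast to xs:string. :)
declare
    %test:assertTrue
function tp:contains-node-arg() {
    contains(<foo>bar</foo>, "ba")
};

(: xs:anyURI is promoted to xs:string. :)
declare
    %test:assertTrue
function tp:starts-with-uri-arg() {
    starts-with(xs:anyURI("http://exist-db.org"), "http")
};

(: Numbers converted explicitly with string() are valid arguments. :)
declare
    %test:assertEquals("foo2")
function tp:replace-explicit-conversion() {
    replace("foo1", string(1), string(2))
};
```

These cases cover the implicit conversions the spec permits (atomization and URI promotion) plus explicit conversion, which are the patterns existing code can safely rely on.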