QT4 Tests without counterpart in the specs

ChristianGruen commented 1 year ago

The following functions are not defined in the current spec:

fn:unparcel, fn:parcel → dropped
fn:xdm-to-json → #576
fn:concat() → see #701
fn:parts → see #463
codepoints-to-string(), etc. (sequence-valued arguments)

benibela commented 1 year ago

And then there was the discussion about laziness/generators, which is still ongoing.

But these tests kind of assume there already is laziness:


RangeExpr-408f  count(1 to 100000000000)
RangeExpr-408g  count(reverse(1 to 100000000000))
RangeExpr-408h  1 = 1 to 100000000000
RangeExpr-408i  1 = reverse(1 to 100000000000)
RangeExpr-408j  1 < reverse(1 to 100000000000)
RangeExpr-408k  (1 to 100000000000)[100000000000]
RangeExpr-410f  count(-100000000000 to -1)
RangeExpr-410g  count(reverse(-100000000000 to -1))
RangeExpr-410h  -1 = -100000000000 to -1
RangeExpr-410i  -1 = reverse(-100000000000 to -1)
RangeExpr-410j  -1 > reverse(-100000000000 to -1)
RangeExpr-410k  (-100000000000 to -1)[100000000000]
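
For illustration only, here is a minimal Python sketch of why these expressions are cheap once a range is represented by its bounds instead of being materialized. It uses Python's built-in `range` as a stand-in for a lazy sequence; the variable names are hypothetical, and this says nothing about how any particular XQuery processor implements ranges.

```python
# Python's built-in range stores only start/stop/step, so none of these
# operations allocates 100000000000 items (assumes a 64-bit CPython build,
# where len() can return a value of this size).
N = 100_000_000_000

r = range(1, N + 1)           # analogue of: 1 to 100000000000
rev = r[::-1]                 # analogue of: reverse(1 to 100000000000), still lazy

count = len(r)                # RangeExpr-408f: count(1 to N)
count_rev = len(rev)          # RangeExpr-408g: count(reverse(1 to N))
contains_one = 1 in r         # RangeExpr-408h: 1 = 1 to N (existential comparison)
contains_one_rev = 1 in rev   # RangeExpr-408i: 1 = reverse(1 to N)
any_greater = 1 < rev[0]      # RangeExpr-408j: true if the largest item exceeds 1
last_item = r[N - 1]          # RangeExpr-408k: (1 to N)[N]; XPath positions are 1-based
```

An implementation that materializes the sequence instead has to store every item, which is where the memory and sequence-length limits come from.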

ChristianGruen commented 1 year ago

But these tests kind of assume there already is laziness:

Indeed, there are quite a few tests that assume certain optimizations, whatever they look like, in order to run in a reasonable time frame. I observed that when I implemented the features myself years ago. My decision back then was to look for optimizations instead of reporting the tests, although reporting them might have been the better choice.

I guess the referenced tests have all been added by me. I’d be happy to remove/modify them if there’s consensus about it.

@benibela Have you observed some more tests that take a long time to be evaluated and that could be tweaked?

michaelhkay commented 1 year ago

When we run tests, we have an exceptions file that lists tests that we don't run, with reasons, and one reason for not running a test is that it takes too long, or breaks limits in the product such as the length of a sequence. So we have in effect a shopping list of optimizations that we know we haven't implemented. I'm perfectly happy to have tests like that in the test suite; they are useful even if we can't pass them today.
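
As a rough sketch of the idea, an exceptions list can be modelled as a mapping from test name to the reason for skipping it. The data structure, function names and reason strings below are assumptions made for illustration, not the format of any real exceptions file; the two test names are taken from this thread.

```python
from collections import Counter

# Hypothetical exceptions list: test name -> reason for not running it.
EXCEPTIONS = {
    "RangeExpr-408g": "exceeds system limits",
    "RangeExpr-410k": "exceeds system limits",
    "same-key-023": "takes too long",
}

def select_tests(all_tests, exceptions):
    """Return the tests that should actually be run, in their original order."""
    return [name for name in all_tests if name not in exceptions]

def shopping_list(exceptions):
    """Count the skipped tests per reason, i.e. the known gaps."""
    return Counter(exceptions.values())

to_run = select_tests(["RangeExpr-408g", "fn-sort-001", "same-key-023"], EXCEPTIONS)
reasons = shopping_list(EXCEPTIONS)
```

Here `fn-sort-001` is a made-up test name that only serves to show a test that is not excluded.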

ChristianGruen commented 1 year ago

When we run tests, we have an exceptions file that lists tests that we don't run, with reasons, and one reason for not running a test is that it takes too long, or breaks limits in the product such as the length of a sequence.

Feel free to share that list! We try to run all tests (provided we support the features), but if we notice that none of us reading this manages to evaluate the tests, it might be better to drop them from the test suite…

michaelhkay commented 1 year ago

I've just done a pass on the changes.xml file in the qt4tests directory (currently there's no equivalent for the XSLT tests) ensuring that there's an entry for every substantive PR. The entries now cross-reference the PR where possible. The introspection test Catalog014 checks that every 4.0 test case links to an entry in the changes.xml file, while test Catalog015 checks that every entry in the changes.xml file is referenced by a set of test cases. Currently of course coverage is incomplete, but this mechanism allows us to track the coverage.
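
A two-way check of this kind could be sketched as follows. The element and attribute names (`change`, `id`, `test-case`, `covers`) and the sample data are assumptions made for illustration; they are not necessarily the names used in changes.xml or in the test catalog, and this is not the actual implementation of Catalog014 or Catalog015.

```python
import xml.etree.ElementTree as ET

def cross_check(changes_xml: str, catalog_xml: str):
    """Return (tests without a valid link, change entries no test refers to)."""
    change_ids = {c.get("id") for c in ET.fromstring(changes_xml).iter("change")}
    referenced = set()
    unlinked_tests = []
    for tc in ET.fromstring(catalog_xml).iter("test-case"):
        # Assumed: a whitespace-separated list of change ids on each test case.
        refs = set((tc.get("covers") or "").split())
        referenced |= refs
        if not refs or not refs <= change_ids:
            unlinked_tests.append(tc.get("name"))   # analogue of the first check
    unreferenced = sorted(change_ids - referenced)  # analogue of the second check
    return unlinked_tests, unreferenced

# Made-up sample data.
CHANGES = '<changes><change id="fn-sort"/><change id="array-slice"/></changes>'
CATALOG = """<test-set>
  <test-case name="slice-001" covers="array-slice"/>
  <test-case name="orphan-001" covers="no-such-change"/>
</test-set>"""

unlinked, unreferenced = cross_check(CHANGES, CATALOG)
```

With this sample, `orphan-001` is reported as a test without a valid entry, and `fn-sort` as an entry that no test references.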

michaelhkay commented 1 year ago

Here's the current coverage data (number of tests by feature):

<r>
   <tests change="NameTestUnion"
          count="37"
          description="PR 606: Name test unions"/>
   <tests change="arity-coercion"
          count="7"
          description="PR 254: Reducing arity in coercion rules"/>
   <tests change="array-build"
          count="4"
          description="PR 360, 420: New function array:build"/>
   <tests change="array-foot"
          count="9"
          description="PR 250: New function array:foot"/>
   <tests change="array-get"
          count="7"
          description="PR 289: Extra argument to array:get"/>
   <tests change="array-members"
          count="6"
          description="PR 360, 420: New function array:members"/>
   <tests change="array-of-members"
          count="6"
          description="PR 420: New function array:of-members"/>
   <tests change="array-slice"
          count="71"
          description="PR 477: New function array:slice"/>
   <tests change="array-split"
          count="11"
          description="New function array:split"/>
   <tests change="array-trunk"
          count="6"
          description="PR 250: New function array:trunk"/>
   <tests change="array-values"
          count="0"
          description="PR 476: New function array:values"/>
   <tests change="binary-literals"
          count="8"
          description="PR 433, 456: Binary notation in numeric literals"/>
   <tests change="coercion-in-variables"
          count="30"
          description="Coercion rules are applied to variables"/>
   <tests change="constructors"
          count="0"
          description="PR 408, 658[b]: Changes to constructor functions"/>
   <tests change="defaulted-params"
          count="36"
          description="PR 166, 197, 375, 512: Default params in function declarations"/>
   <tests change="downcasting"
          count="10"
          description="Downcasting in coercion rules"/>
   <tests change="escape-solidus"
          count="3"
          description="PR 534: New serialization parameter escape-solidus"/>
   <tests change="extended-annotations"
          count="0"
          description="PR 682: Boolean and negative annotation values"/>
   <tests change="fn-QName"
          count="0"
          description="PR 207: arity-1 variant of fn:QName"/>
   <tests change="fn-all-different"
          count="31"
          description="New function fn:all-different"/>
   <tests change="fn-all-equal"
          count="30"
          description="New function fn:all-equal"/>
   <tests change="fn-atomic-equal"
          count="29"
          description="PR 319: New function fn:atomic-equal"/>
   <tests change="fn-build-uri"
          count="39"
          description="PR 215, 245, 347, 415: New function fn:build-uri"/>
   <tests change="fn-char"
          count="34"
          description="PR 261, 306: New function fn:char"/>
   <tests change="fn-characters"
          count="12"
          description="New function fn:characters"/>
   <tests change="fn-codepoints-to-string"
          count="7"
          description="Function fn:codepoints-to-string becomes variadic (TODO???)"/>
   <tests change="fn-concat"
          count="14"
          description="Function fn:concat allows sequence-valued arguments (TODO???)"/>
   <tests change="fn-contains-sequence"
          count="31"
          description="PR 243: New function fn:contains-sequence"/>
   <tests change="fn-decode-from-uri"
          count="0"
          description="PR 631: New function fn:decode-from-uri"/>
   <tests change="fn-deep-equal"
          count="85"
          description="PR 320, 396, 543: New options argument to function fn:deep-equal"/>
   <tests change="fn-doc"
          count="0"
          description="PR 430: fn:doc error handling"/>
   <tests change="fn-duplicate-values"
          count="126"
          description="PR 614: New function fn:duplicate-values"/>
   <tests change="fn-ends-with-sequence"
          count="31"
          description="PR 243: New function fn:ends-with-sequence"/>
   <tests change="fn-every"
          count="18"
          description="PR 140, 152: New function fn:every"/>
   <tests change="fn-expanded-QName"
          count="9"
          description="PR 207: New function fn:expanded-QName"/>
   <tests change="fn-foot"
          count="8"
          description="PR 250: New function fn:foot"/>
   <tests change="fn-format-integer"
          count="17"
          description="Changes to fn:format-integer, for example hex and binary output"/>
   <tests change="fn-format-number"
          count="2"
          description="Changes to fn:format-number, decimal format supplied as QName"/>
   <tests change="fn-highest"
          count="32"
          description="New function fn:highest"/>
   <tests change="fn-identity"
          count="5"
          description="New function fn:identity"/>
   <tests change="fn-in-scope-namespaces"
          count="9"
          description="New function fn:in-scope-namespaces"/>
   <tests change="fn-index-where"
          count="15"
          description="PR 258: New function fn:items-where"/>
   <tests change="fn-intersperse"
          count="13"
          description="PR 163: New function fn:intersperse"/>
   <tests change="fn-is-NaN" count="16" description="New function fn:is-NaN"/>
   <tests change="fn-items-after"
          count="9"
          description="PR 199: New function fn:items-after"/>
   <tests change="fn-items-at"
          count="24"
          description="PR 249: New function fn:items-at"/>
   <tests change="fn-items-before"
          count="9"
          description="PR 199: New function fn:items-before"/>
   <tests change="fn-items-ending-where"
          count="17"
          description="PR 199: New function fn:items-ending-where"/>
   <tests change="fn-items-starting-where"
          count="17"
          description="PR 199: New function fn:items-starting-where"/>
   <tests change="fn-iterate-while"
          count="16"
          description="PR 210, 465: New function fn:iterate-while"/>
   <tests change="fn-json-doc"
          count="1"
          description="Extra options to fn:json-doc"/>
   <tests change="fn-load-xquery-module"
          count="0"
          description="PR 549: Changes to fn:load-xquery-module"/>
   <tests change="fn-log" count="1" description="PR 629: New function fn:log"/>
   <tests change="fn-lowest" count="32" description="New function fn:lowest"/>
   <tests change="fn-op" count="34" description="PR 198: New function fn:op"/>
   <tests change="fn-parcel"
          count="6"
          description="New function fn:parcel (TODO: drop???)"/>
   <tests change="fn-parse-QName"
          count="19"
          description="New function fn:parse-QName"/>
   <tests change="fn-parse-csv"
          count="0"
          description="PR 533: New function fn:parse-csv"/>
   <tests change="fn-parse-html"
          count="1379"
          description="PR 259, 330: New function fn:parse-html"/>
   <tests change="fn-parse-integer"
          count="30"
          description="PR 434, 462: New function fn:parse-integer"/>
   <tests change="fn-parse-json"
          count="0"
          description="New options for function fn:parse-json"/>
   <tests change="fn-parse-uri"
          count="40"
          description="PR 215, 245, 347, 415: New function fn:parse-uri"/>
   <tests change="fn-partition"
          count="23"
          description="PR 454, 507: New function fn:partition"/>
   <tests change="fn-parts"
          count="94"
          description="New function fn:parts (TODO: not currently in spec)"/>
   <tests change="fn-remove"
          count="6"
          description="PR 313: Changes to fn:remove (remove multiple items)"/>
   <tests change="fn-replace"
          count="0"
          description="PR 612: Changes to fn:replace (new substitute argument)"/>
   <tests change="fn-replicate"
          count="21"
          description="New function fn:replicate"/>
   <tests change="fn-resolve-uri"
          count="2"
          description="PR 424, 426: Changes to fn:resolve-uri (empty sequence in arg 2; fragment id)"/>
   <tests change="fn-slice" count="73" description="New function fn:slice"/>
   <tests change="fn-some"
          count="18"
          description="PR 140, 152: New function fn:some"/>
   <tests change="fn-sort" count="0" description="PR 623: fn:sort descending"/>
   <tests change="fn-starts-with-sequence"
          count="31"
          description="PR 243: New function fn:starts-with-sequence"/>
   <tests change="fn-transform"
          count="2"
          description="PR 427: Changes to function fn:transform"/>
   <tests change="fn-transitive-closure"
          count="15"
          description="PR 521: New function fn:transitive-closure"/>
   <tests change="fn-trunk"
          count="8"
          description="PR 250: New function fn:trunk"/>
   <tests change="fn-unparcel"
          count="11"
          description="New function fn:unparcel (TODO: drop???)"/>
   <tests change="fn-void"
          count="1"
          description="PR 575: New function fn:void"/>
   <tests change="fn-xdm-to-json"
          count="152"
          description="Extra options to fn:xdm-to-json"/>
   <tests change="fn-xml-to-json"
          count="4"
          description="Changes to function fn:xml-to-json"/>
   <tests change="hex-literals"
          count="23"
          description="PR 433, 456: Hex notation in numeric literals"/>
   <tests change="if-curlies"
          count="42"
          description="PR 284: Curly braces in if expression"/>
   <tests change="keywords"
          count="289"
          description="Keywords in static function calls"/>
   <tests change="map-build"
          count="33"
          description="PR 203, 420: New function map:build"/>
   <tests change="map-entries"
          count="0"
          description="PR 420: New function map:entries"/>
   <tests change="map-filter"
          count="14"
          description="Changes to function map:filter"/>
   <tests change="map-get"
          count="6"
          description="PR 289: Extra argument to map:get"/>
   <tests change="map-group-by"
          count="0"
          description="New function map:group-by (TODO: status ???)"/>
   <tests change="map-keys"
          count="3"
          description="PR 478, 515: Extra argument to map:keys"/>
   <tests change="map-of-pairs"
          count="18"
          description="PR 360, 420: New function map:of-pairs"/>
   <tests change="map-pair"
          count="0"
          description="PR 420: New function map:pair"/>
   <tests change="map-values"
          count="0"
          description="PR 360, 420: New function map:values"/>
   <tests change="meta-elements"
          count="0"
          description="PR 342: Revise serialization rules for meta elements"/>
   <tests change="misc-collation-optional"
          count="0"
          description="PR 590: $collation argument may be empty"/>
   <tests change="multiple-for"
          count="1"
          description="PR 28, 344: Multiple for clauses in an expression"/>
   <tests change="multiple-let"
          count="2"
          description="PR 28, 344: Multiple let clauses in an expression"/>
   <tests change="numeric-underscores"
          count="11"
          description="PR 433, 456: Underscores in numeric literals"/>
   <tests change="operator-symbols"
          count="29"
          description="PR 466, 544: Non-ASCII characters in operator tokens"/>
   <tests change="otherwise" count="7" description="Otherwise operator"/>
   <tests change="plausibility"
          count="0"
          description="PR 603: Implausible expressions"/>
   <tests change="prod-EnumerationType" count="14" description="enum() types"/>
   <tests change="prod-ForClause.member"
          count="34"
          description="For-member in for expressions"/>
   <tests change="prod-InlineFunctionExpr.focus"
          count="29"
          description="PR 524: abbreviated inline functions - focus notation"/>
   <tests change="prod-LambdaExpr"
          count="9"
          description="PR 550, 561: abbreviated inline functions - lambda notation"/>
   <tests change="prod-LocalUnionType"
          count="7"
          description="local union types"/>
   <tests change="prod-MappingArrow"
          count="49"
          description="Tests the =!&gt; operator"/>
   <tests change="prod-StringTemplate"
          count="53"
          description="PR 324: String templates"/>
   <tests change="prod-ThickArrow"
          count="0"
          description="PR 545: Inline functions after arrow operator"/>
   <tests change="prod-ThinArrow"
          count="0"
          description="PR 447: Thin arrow expressions"/>
   <tests change="prod-UnionNodeTest"
          count="12"
          description="PR 286: Union node test"/>
   <tests change="record-test" count="11" description="Record tests"/>
   <tests change="switch"
          count="8"
          description="PR 364, 587, 671: Changes to switch expressions"/>
   <tests change="try-catch-variables"
          count="0"
          description="PR 493: new variable for error information"/>
   <tests change="typeswitch-braces"
          count="1"
          description="Curly braces in typeswitch expressions"/>
   <tests change="window"
          count="25"
          description="PR 483: Changes to FLWOR window clause"/>
   <tests change="xs-string"
          count="0"
          description="PR 546, 643: Non-XML characters in strings"/>
</r>
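
To pick out the features that have no tests yet, a report of this shape can be filtered with a few lines of Python. This is only a sketch: it assumes the `<r>`/`<tests>` layout shown above, uses three entries copied from the report as sample input, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Three entries copied from the report above, used as sample input.
REPORT = """<r>
   <tests change="array-slice" count="71" description="PR 477: New function array:slice"/>
   <tests change="fn-sort" count="0" description="PR 623: fn:sort descending"/>
   <tests change="map-pair" count="0" description="PR 420: New function map:pair"/>
</r>"""

def uncovered(report_xml: str):
    """Return the change names whose test count is zero, in document order."""
    root = ET.fromstring(report_xml)
    return [t.get("change") for t in root.iter("tests") if int(t.get("count")) == 0]

def total_tests(report_xml: str) -> int:
    """Sum the count attribute over all entries."""
    return sum(int(t.get("count")) for t in ET.fromstring(report_xml).iter("tests"))

missing = uncovered(REPORT)
total = total_tests(REPORT)
```

Run against the full report, the same filter would list every entry with `count="0"`.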

benibela commented 1 year ago

@benibela Have you observed some more tests that take a long time to be evaluated and that could be tweaked?

Well, these range tests do not take a long time. My implementation quickly fails with a "sequence count must fit in a signed 32-bit integer" error. I would need 800 GB of RAM to attempt to run them.

When I run the tests, I exclude "same-key-023,same-key-024,RangeExpr-407c,RangeExpr-407d,RangeExpr-409c,RangeExpr-409d" for performance reasons; these are probably still from XQTS 3.1.

Now I have updated the tests and cannot load the catalog anymore because of type="spec" value="XQ30 X" in focus-function-019.

michaelhkay commented 1 year ago

I guess the referenced tests have all been added by me.

A lot of the more difficult tests come from Tim Mills and Oliver Hallam at CBCL. I think it's good to have these "stretch" tests. No-one is obliged to run them if they don't want to.

michaelhkay commented 1 year ago

I've just checked our exceptions list for 4.0 tests and the majority are features or changes not yet implemented. But the tests

      RangeExpr-408g
      RangeExpr-408h
      RangeExpr-408i
      RangeExpr-408j
      RangeExpr-408k
      RangeExpr-410f
      RangeExpr-410g
      RangeExpr-410h
      RangeExpr-410i
      RangeExpr-410j
      RangeExpr-410k

are excluded because they exceed system limits.

We also exclude a few tests because we're constrained by bugs in third-party products, e.g. validator.nu.

On the XSLT side we also have some tests classified as "slow" which we run less frequently: mainly the comprehensive tests of regex character categories.

ChristianGruen commented 1 year ago

But the tests are excluded because they exceed system limits.

Maybe I should add FOAR0002 as an alternative result to these particular tests?

fn-matches-50 is another test that seems to fail with all implementations I’ve just tried. – But maybe I shouldn’t extend the scope of this issue too much, as it was originally meant to collect the tests that currently have no counterpart in our 4.0 drafts.

ChristianGruen commented 1 year ago

A lot of the more difficult tests come from Tim Mills and Oliver Hallam at CBCL. I think it's good to have these "stretch" tests. No-one is obliged to run them if they don't want to.

In the past, it seemed to be important (at least to some people) to prove that you are 100% compliant. None of us would seemingly be able to claim that for the given test suite, but probably that’s ok as the focus has changed?

michaelhkay commented 1 year ago

We certainly aim for 100% conformance, but there will always be a few problems with limits, resource usage, third-party dependencies, configuration issues, implementation-dependent features, etc that mean 100% conformance isn't the same thing as a 100% pass rate on the tests.

ChristianGruen commented 1 year ago

100% conformance isn't the same thing as a 100% pass rate on the tests.

You're certainly right. I had pages like the following one in mind that focused on the test suite conformance…

https://dev.w3.org/2006/xquery-test-suite/PublicPagesStagingArea/XQTSReportSimple_XQTS_1_0_2.html

…and implementors liked to stress that they pass lots of the tests to indicate that conformance is important to them.

Back then, we had one implementation (Saxon, of course) that passed 100% of the tests, and others were more or less close. Today, as I understand it, no one would reach 100% anymore (good for me to know; I wasn't aware of that).

michaelhkay commented 1 year ago

I think those reports allowed you to report tests as passed, failed, or not run with a stated reason. Valid reasons included the test having dependencies that the product doesn't support (e.g. schema-awareness), the test exceeding implementation-defined limits, or the test being subject to an open bug report. On that measure, with the exception of a tiny handful of tests where we have problems with third-party libraries such as ICU, I think we still believe we can achieve 100% compliance once the specs and the implementation have stabilised sufficiently. However, we no longer have any requirement to convince W3C that conforming implementations exist, so I don't think we need to go through this exercise.

ChristianGruen commented 1 year ago

I’ve added references to existing issues to the original comment.

ChristianGruen commented 10 months ago

The current state of the test suite is a bit messy, but I assume it would take too much time to document it here and keep it up to date, so I propose to close this issue.

ndw commented 10 months ago

The CG agreed to close this issue without action at meeting 062.

qt4cg / qtspecs

QT4 Tests without counterpart in the specs #693