Closed ChristianGruen closed 10 months ago
and then there was the discussion about laziness/generators which is still ongoing.
But these tests kind of assume there already is laziness:
RangeExpr-408f count(1 to 100000000000)
RangeExpr-408g count(reverse(1 to 100000000000))
RangeExpr-408h 1 = 1 to 100000000000
RangeExpr-408i 1 = reverse(1 to 100000000000)
RangeExpr-408j 1 < reverse(1 to 100000000000)
RangeExpr-408k (1 to 100000000000)[100000000000]
RangeExpr-410f count(-100000000000 to -1)
RangeExpr-410g count(reverse(-100000000000 to -1))
RangeExpr-410h -1 = -100000000000 to -1
RangeExpr-410i -1 = reverse(-100000000000 to -1)
RangeExpr-410j -1 > reverse(-100000000000 to -1)
RangeExpr-410k (-100000000000 to -1)[100000000000]
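To make concrete what "assuming laziness" means for these tests, here is a minimal Python sketch of a range kept as two bounds instead of a materialized sequence. The `LazyRange` class and its method names are hypothetical and only illustrate the idea; they are not taken from any implementation discussed here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LazyRange:
    """Integer range `start to end`, stored as bounds and never materialized."""
    start: int
    end: int
    reversed_: bool = False

    def count(self) -> int:
        # count(1 to N) needs only arithmetic on the bounds.
        return max(0, self.end - self.start + 1)

    def reverse(self) -> "LazyRange":
        # reverse() flips a flag instead of copying items.
        return LazyRange(self.start, self.end, not self.reversed_)

    def contains(self, value: int) -> bool:
        # The general comparison `value = range` reduces to a bounds check.
        return self.start <= value <= self.end

    def any_greater_than(self, value: int) -> bool:
        # `value < range` holds if the largest item exceeds value.
        return self.count() > 0 and self.end > value

    def any_less_than(self, value: int) -> bool:
        # `value > range` holds if the smallest item is below value.
        return self.count() > 0 and self.start < value

    def item_at(self, position: int) -> Optional[int]:
        # 1-based positional access; None stands in for the empty sequence.
        if not 1 <= position <= self.count():
            return None
        if self.reversed_:
            return self.end - (position - 1)
        return self.start + (position - 1)


big = LazyRange(1, 100_000_000_000)
negative = LazyRange(-100_000_000_000, -1)

print(big.count())                           # 100000000000
print(big.reverse().contains(1))             # True
print(big.item_at(100_000_000_000))          # 100000000000
print(negative.reverse().any_less_than(-1))  # True
```

With such a representation every test in the list is answered in constant time and memory; without it, an engine has to build all 100000000000 items first.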
> But these tests kind of assume there already is laziness:
Indeed, there are quite a few tests that assume certain optimizations, whatever they look like, in order to run in a reasonable time frame. I observed that years ago when implementing the features myself. My decision back then was to look for optimizations rather than report the tests, although reporting them might have been the better choice.
I guess the referenced tests have all been added by me. I’d be happy to remove/modify them if there’s consensus about it.
@benibela Have you observed some more tests that take a long time to be evaluated and that could be tweaked?
When we run tests, we have an exceptions file that lists tests that we don't run, with reasons, and one reason for not running a test is that it takes too long, or breaks limits in the product such as the length of a sequence. So we have in effect a shopping list of optimizations that we know we haven't implemented. I'm perfectly happy to have tests like that in the test suite; they are useful even if we can't pass them today.
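As an illustration of the exceptions-file approach, here is a minimal Python sketch. The tab-separated file format, the reasons, and the function names are assumptions made for this example; the actual file used by any product may look quite different.

```python
import io

# Hypothetical exceptions file: one test name and a reason per line, tab-separated.
EXCEPTIONS_TEXT = """\
# name<TAB>reason
RangeExpr-408g\texceeds sequence length limit
RangeExpr-410k\texceeds sequence length limit
same-key-023\ttoo slow
"""


def load_exceptions(stream):
    """Return {test name: reason}, skipping blank lines and '#' comments."""
    exceptions = {}
    for line in stream:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, reason = line.partition("\t")
        exceptions[name] = reason or "no reason given"
    return exceptions


def split_tests(all_tests, exceptions):
    """Partition test names into those to run and those skipped (with reasons)."""
    to_run = [t for t in all_tests if t not in exceptions]
    skipped = {t: exceptions[t] for t in all_tests if t in exceptions}
    return to_run, skipped


exceptions = load_exceptions(io.StringIO(EXCEPTIONS_TEXT))
to_run, skipped = split_tests(
    ["RangeExpr-408f", "RangeExpr-408g", "same-key-023"], exceptions
)
print(to_run)   # ['RangeExpr-408f']
print(skipped)  # {'RangeExpr-408g': 'exceeds sequence length limit', 'same-key-023': 'too slow'}
```

Keeping the reason next to each name is what turns the list into a "shopping list": the skipped entries can be grouped by reason to see which optimizations or limits are outstanding.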
> When we run tests, we have an exceptions file that lists tests that we don't run, with reasons, and one reason for not running a test is that it takes too long, or breaks limits in the product such as the length of a sequence.
Feel free to share that list! We try to run all tests (provided we support the features), but if it turns out that none of us reading this manages to evaluate the tests, it might be better to drop them from the test suite…
I've just done a pass on the changes.xml file in the qt4tests directory (currently there's no equivalent for the XSLT tests) ensuring that there's an entry for every substantive PR. The entries now cross-reference the PR where possible. The introspection test Catalog014 checks that every 4.0 test case links to an entry in the changes.xml file, while test Catalog015 checks that every entry in the changes.xml file is referenced by a set of test cases. Currently of course coverage is incomplete, but this mechanism allows us to track the coverage.
Here's the current coverage data (number of tests by feature):
<r>
<tests change="NameTestUnion"
count="37"
description="PR 606: Name test unions"/>
<tests change="arity-coercion"
count="7"
description="PR 254: Reducing arity in coercion rules"/>
<tests change="array-build"
count="4"
description="PR 360, 420: New function array:build"/>
<tests change="array-foot"
count="9"
description="PR 250: New function array:foot"/>
<tests change="array-get"
count="7"
description="PR 289: Extra argument to array:get"/>
<tests change="array-members"
count="6"
description="PR 360, 420: New function array:members"/>
<tests change="array-of-members"
count="6"
description="PR 420: New function array:of-members"/>
<tests change="array-slice"
count="71"
description="PR 477: New function array:slice"/>
<tests change="array-split"
count="11"
description="New function array:split"/>
<tests change="array-trunk"
count="6"
description="PR 250: New function array:trunk"/>
<tests change="array-values"
count="0"
description="PR 476: New function array:values"/>
<tests change="binary-literals"
count="8"
description="PR 433, 456: Binary notation in numeric literals"/>
<tests change="coercion-in-variables"
count="30"
description="Coercion rules are applied to variables"/>
<tests change="constructors"
count="0"
description="PR 408, 658[b]: Changes to constructor functions"/>
<tests change="defaulted-params"
count="36"
description="PR 166, 197, 375, 512: Default params in function declarations"/>
<tests change="downcasting"
count="10"
description="Downcasting in coercion rules"/>
<tests change="escape-solidus"
count="3"
description="PR 534: New serialization parameter escape-solidus"/>
<tests change="extended-annotations"
count="0"
description="PR 682: Boolean and negative annotation values"/>
<tests change="fn-QName"
count="0"
description="PR 207: arity-1 variant of fn:QName"/>
<tests change="fn-all-different"
count="31"
description="New function fn:all-different"/>
<tests change="fn-all-equal"
count="30"
description="New function fn:all-equal"/>
<tests change="fn-atomic-equal"
count="29"
description="PR 319: New function fn:atomic-equal"/>
<tests change="fn-build-uri"
count="39"
description="PR 215, 245, 347, 415: New function fn:build-uri"/>
<tests change="fn-char"
count="34"
description="PR 261, 306: New function fn:char"/>
<tests change="fn-characters"
count="12"
description="New function fn:characters"/>
<tests change="fn-codepoints-to-string"
count="7"
description="Function fn:codepoints-to-string becomes variadic (TODO???)"/>
<tests change="fn-concat"
count="14"
description="Function fn:concat allows sequence-valued arguments (TODO???)"/>
<tests change="fn-contains-sequence"
count="31"
description="PR 243: New function fn:contains-sequence"/>
<tests change="fn-decode-from-uri"
count="0"
description="PR 631: New function fn:decode-from-uri"/>
<tests change="fn-deep-equal"
count="85"
description="PR 320, 396, 543: New options argument to function fn:deep-equal"/>
<tests change="fn-doc"
count="0"
description="PR 430: fn:doc error handling"/>
<tests change="fn-duplicate-values"
count="126"
description="PR 614: New function fn:duplicate-values"/>
<tests change="fn-ends-with-sequence"
count="31"
description="PR 243: New function fn:ends-with-sequence"/>
<tests change="fn-every"
count="18"
description="PR 140, 152: New function fn:every"/>
<tests change="fn-expanded-QName"
count="9"
description="PR 207: New function fn:expanded-QName"/>
<tests change="fn-foot"
count="8"
description="PR 250: New function fn:foot"/>
<tests change="fn-format-integer"
count="17"
description="Changes to fn:format-integer, for example hex and binary output"/>
<tests change="fn-format-number"
count="2"
description="Changes to fn:format-number, decimal format supplied as QName"/>
<tests change="fn-highest"
count="32"
description="New function fn:highest"/>
<tests change="fn-identity"
count="5"
description="New function fn:identity"/>
<tests change="fn-in-scope-namespaces"
count="9"
description="New function fn:in-scope-namespaces"/>
<tests change="fn-index-where"
count="15"
description="PR 258: New function fn:items-where"/>
<tests change="fn-intersperse"
count="13"
description="PR 163: New function fn:intersperse"/>
<tests change="fn-is-NaN" count="16" description="New function fn:is-NaN"/>
<tests change="fn-items-after"
count="9"
description="PR 199: New function fn:items-after"/>
<tests change="fn-items-at"
count="24"
description="PR 249: New function fn:items-at"/>
<tests change="fn-items-before"
count="9"
description="PR 199: New function fn:items-before"/>
<tests change="fn-items-ending-where"
count="17"
description="PR 199: New function fn:items-ending-where"/>
<tests change="fn-items-starting-where"
count="17"
description="PR 199: New function fn:items-starting-where"/>
<tests change="fn-iterate-while"
count="16"
description="PR 210, 465: New function fn:iterate-while"/>
<tests change="fn-json-doc"
count="1"
description="Extra options to fn:json-doc"/>
<tests change="fn-load-xquery-module"
count="0"
description="PR 549: Changes to fn:load-xquery-module"/>
<tests change="fn-log" count="1" description="PR 629: New function fn:log"/>
<tests change="fn-lowest" count="32" description="New function fn:lowest"/>
<tests change="fn-op" count="34" description="PR 198: New function fn:op"/>
<tests change="fn-parcel"
count="6"
description="New function fn:parcel (TODO: drop???)"/>
<tests change="fn-parse-QName"
count="19"
description="New function fn:parse-QName"/>
<tests change="fn-parse-csv"
count="0"
description="PR 533: New function fn:parse-csv"/>
<tests change="fn-parse-html"
count="1379"
description="PR 259, 330: New function fn:parse-html"/>
<tests change="fn-parse-integer"
count="30"
description="PR 434, 462: New function fn:parse-integer"/>
<tests change="fn-parse-json"
count="0"
description="New options for function fn:parse-json"/>
<tests change="fn-parse-uri"
count="40"
description="PR 215, 245, 347, 415: New function fn:parse-uri"/>
<tests change="fn-partition"
count="23"
description="PR 454, 507: New function fn:partition"/>
<tests change="fn-parts"
count="94"
description="New function fn:parts (TODO: not currently in spec)"/>
<tests change="fn-remove"
count="6"
description="PR 313: Changes to fn:remove (remove multiple items)"/>
<tests change="fn-replace"
count="0"
description="PR 612: Changes to fn:replace (new substitute argument)"/>
<tests change="fn-replicate"
count="21"
description="New function fn:replicate"/>
<tests change="fn-resolve-uri"
count="2"
description="PR 424, 426: Changes to fn:resolve-uri (empty sequence in arg 2; fragment id)"/>
<tests change="fn-slice" count="73" description="New function fn:slice"/>
<tests change="fn-some"
count="18"
description="PR 140, 152: New function fn:some"/>
<tests change="fn-sort" count="0" description="PR 623: fn:sort descending"/>
<tests change="fn-starts-with-sequence"
count="31"
description="PR 243: New function fn:starts-with-sequence"/>
<tests change="fn-transform"
count="2"
description="PR 427: Changes to function fn:transform"/>
<tests change="fn-transitive-closure"
count="15"
description="PR 521: New function fn:transitive-closure"/>
<tests change="fn-trunk"
count="8"
description="PR 250: New function fn:trunk"/>
<tests change="fn-unparcel"
count="11"
description="New function fn:unparcel (TODO: drop???)"/>
<tests change="fn-void"
count="1"
description="PR 575: New function fn:void"/>
<tests change="fn-xdm-to-json"
count="152"
description="Extra options to fn:xdm-to-json"/>
<tests change="fn-xml-to-json"
count="4"
description="Changes to function fn:xml-to-json"/>
<tests change="hex-literals"
count="23"
description="PR 433, 456: Hex notation in numeric literals"/>
<tests change="if-curlies"
count="42"
description="PR 284: Curly braces in if expression"/>
<tests change="keywords"
count="289"
description="Keywords in static function calls"/>
<tests change="map-build"
count="33"
description="PR 203, 420: New function map:build"/>
<tests change="map-entries"
count="0"
description="PR 420: New function map:entries"/>
<tests change="map-filter"
count="14"
description="Changes to function map:filter"/>
<tests change="map-get"
count="6"
description="PR 289: Extra argument to map:get"/>
<tests change="map-group-by"
count="0"
description="New function map:group-by (TODO: status ???)"/>
<tests change="map-keys"
count="3"
description="PR 478, 515: Extra argument to map:keys"/>
<tests change="map-of-pairs"
count="18"
description="PR 360, 420: New function map:of-pairs"/>
<tests change="map-pair"
count="0"
description="PR 420: New function map:pair"/>
<tests change="map-values"
count="0"
description="PR 360, 420: New function map:values"/>
<tests change="meta-elements"
count="0"
description="PR 342: Revise serialization rules for meta elements"/>
<tests change="misc-collation-optional"
count="0"
description="PR 590: $collation argument may be empty"/>
<tests change="multiple-for"
count="1"
description="PR 28, 344: Multiple for clauses in an expression"/>
<tests change="multiple-let"
count="2"
description="PR 28, 344: Multiple let clauses in an expression"/>
<tests change="numeric-underscores"
count="11"
description="PR 433, 456: Underscores in numeric literals"/>
<tests change="operator-symbols"
count="29"
description="PR 466, 544: Non-ASCII characters in operator tokens"/>
<tests change="otherwise" count="7" description="Otherwise operator"/>
<tests change="plausibility"
count="0"
description="PR 603: Implausible expressions"/>
<tests change="prod-EnumerationType" count="14" description="enum() types"/>
<tests change="prod-ForClause.member"
count="34"
description="For-member in for expressions"/>
<tests change="prod-InlineFunctionExpr.focus"
count="29"
description="PR 524: abbreviated inline functions - focus notation"/>
<tests change="prod-LambdaExpr"
count="9"
description="PR 550, 561: abbreviated inline functions - lambda notation"/>
<tests change="prod-LocalUnionType"
count="7"
description="local union types"/>
<tests change="prod-MappingArrow"
count="49"
description="Tests the =!> operator"/>
<tests change="prod-StringTemplate"
count="53"
description="PR 324: String templates"/>
<tests change="prod-ThickArrow"
count="0"
description="PR 545: Inline functions after arrow operator"/>
<tests change="prod-ThinArrow"
count="0"
description="PR 447: Thin arrow expressions"/>
<tests change="prod-UnionNodeTest"
count="12"
description="PR 286: Union node test"/>
<tests change="record-test" count="11" description="Record tests"/>
<tests change="switch"
count="8"
description="PR 364, 587, 671: Changes to switch expressions"/>
<tests change="try-catch-variables"
count="0"
description="PR 493: new variable for error information"/>
<tests change="typeswitch-braces"
count="1"
description="Curly braces in typeswitch expressions"/>
<tests change="window"
count="25"
description="PR 483: Changes to FLWOR window clause"/>
<tests change="xs-string"
count="0"
description="PR 546, 643: Non-XML characters in strings"/>
</r>
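The coverage data above is easy to query for gaps. The following Python sketch lists the changes that have no tests yet and sums the test counts; it relies only on the `r`/`tests` element names and the `change` and `count` attributes shown above, and the helper names are made up for this example.

```python
import xml.etree.ElementTree as ET

# A few entries copied from the coverage data above.
COVERAGE_XML = """
<r>
  <tests change="array-slice" count="71" description="PR 477: New function array:slice"/>
  <tests change="fn-sort" count="0" description="PR 623: fn:sort descending"/>
  <tests change="map-pair" count="0" description="PR 420: New function map:pair"/>
</r>
"""


def untested_changes(xml_text):
    """Return the change ids whose test count is zero, in document order."""
    root = ET.fromstring(xml_text)
    return [t.get("change") for t in root.iter("tests") if int(t.get("count")) == 0]


def total_tests(xml_text):
    """Return the sum of all count attributes."""
    root = ET.fromstring(xml_text)
    return sum(int(t.get("count")) for t in root.iter("tests"))


print(untested_changes(COVERAGE_XML))  # ['fn-sort', 'map-pair']
print(total_tests(COVERAGE_XML))       # 71
```

Run against the full report, `untested_changes` would give the list of changes that still need test cases.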
> @benibela Have you observed some more tests that take a long time to be evaluated and that could be tweaked?
Well, these range tests do not take a long time. My implementation quickly fails with a "sequence count must fit in a signed 32-bit integer" error. I would need 800 GB of RAM to even attempt to run them.
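Both numbers can be checked with a few lines of Python. The 8 bytes per item is an assumption (one 64-bit integer per sequence item); it is the value that reproduces the 800 GB figure, but the actual per-item cost in any given implementation may differ.

```python
ITEMS = 100_000_000_000   # upper bound used in the RangeExpr tests
INT32_MAX = 2**31 - 1     # largest signed 32-bit integer
BYTES_PER_ITEM = 8        # assumption: one 64-bit integer per item

exceeds_int32 = ITEMS > INT32_MAX
memory_gb = ITEMS * BYTES_PER_ITEM / 10**9

print(INT32_MAX)      # 2147483647
print(exceeds_int32)  # True
print(memory_gb)      # 800.0
```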
When I run the tests, I exclude "same-key-023,same-key-024,RangeExpr-407c,RangeExpr-407d,RangeExpr-409c,RangeExpr-409d" for performance reasons; these are probably still from XQTS 3.1.
Now I have updated the tests and cannot load the catalog anymore because of type="spec" value="XQ30 X" in focus-function-019.
> I guess the referenced tests have all been added by me.
A lot of the more difficult tests come from Tim Mills and Oliver Hallam at CBCL. I think it's good to have these "stretch" tests. No-one is obliged to run them if they don't want to.
I've just checked our exceptions list for 4.0 tests and the majority are features or changes not yet implemented. But the tests
RangeExpr-408g
RangeExpr-408h
RangeExpr-408i
RangeExpr-408j
RangeExpr-408k
RangeExpr-410f
RangeExpr-410g
RangeExpr-410h
RangeExpr-410i
RangeExpr-410j
RangeExpr-410k
are excluded because they exceed system limits.
We also exclude a few tests because we're constrained by bugs in third-party products, e.g. validator.nu.
On the XSLT side we also have some tests classified as "slow" which we run less frequently: mainly the comprehensive tests of regex character categories.
> But the tests are excluded because they exceed system limits.
Maybe I should add FOAR0002 as an alternative result for these particular tests?
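A test with an alternative result passes either by returning the expected value or by raising one of the permitted errors. Here is a minimal Python sketch of that acceptance rule; `XPathError`, `passes`, and the two stand-in engines are hypothetical names for illustration and do not describe how any real test driver is written.

```python
class XPathError(Exception):
    """Hypothetical error type carrying an error code such as FOAR0002."""

    def __init__(self, code):
        super().__init__(code)
        self.code = code


def passes(run, expected_value, allowed_errors=()):
    """True if run() returns expected_value or raises one of allowed_errors."""
    try:
        return run() == expected_value
    except XPathError as err:
        return err.code in allowed_errors


def lazy_engine():
    # Stands in for an engine that evaluates count(1 to 100000000000) lazily.
    return 100_000_000_000


def limited_engine():
    # Stands in for an engine that hits an implementation limit.
    raise XPathError("FOAR0002")


print(passes(lazy_engine, 100_000_000_000, ("FOAR0002",)))     # True
print(passes(limited_engine, 100_000_000_000, ("FOAR0002",)))  # True
print(passes(limited_engine, 100_000_000_000))                 # False
```

Under this rule an implementation that reports the limit cleanly is not counted as failing, while one that returns a wrong value still is.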
fn-matches-50 is another test that seems to fail with all implementations I’ve just tried. But maybe I shouldn’t extend the scope of this issue too much, as it was originally meant to collect the tests that currently have no counterpart in our 4.0 drafts.
> A lot of the more difficult tests come from Tim Mills and Oliver Hallam at CBCL. I think it's good to have these "stretch" tests. No-one is obliged to run them if they don't want to.
In the past, it seemed to be important (at least to some people) to prove that you are 100% compliant. Seemingly none of us would be able to claim that for the given test suite, but that’s probably OK, as the focus has changed?
We certainly aim for 100% conformance, but there will always be a few problems with limits, resource usage, third-party dependencies, configuration issues, implementation-dependent features, etc that mean 100% conformance isn't the same thing as a 100% pass rate on the tests.
> 100% conformance isn't the same thing as a 100% pass rate on the tests.
You're certainly right. I had pages like the following one in mind that focused on the test suite conformance…
https://dev.w3.org/2006/xquery-test-suite/PublicPagesStagingArea/XQTSReportSimple_XQTS_1_0_2.html
…and implementors liked to stress that they pass lots of the tests to indicate that conformance is important to them.
Back then, we had one implementation (Saxon, of course) that passed 100% of the tests, and others were more or less close. Today, as I understand, no one would reach 100% anymore (good to know for me, I wasn't aware of that).
I think those reports allowed you to report tests as passed, failed, or not run with a stated reason. Valid reasons included the test having dependencies that the product doesn't support (e.g. schema-awareness), the test exceeding implementation-defined limits, or the test being subject to an open bug report. On that measure, with the exception of a tiny handful of tests where we have problems with third-party libraries such as ICU, I think we still believe we can achieve 100% compliance once the specs and the implementation have stabilised sufficiently. However, we no longer have any requirement to convince W3C that conforming implementations exist, so I don't think we need to go through this exercise.
I’ve added references to existing issues to the original comment.
The current state of the test suite is a bit messy, but I assume it would take too much time to document it here and keep it up to date, so I propose to close this issue.
The CG agreed to close this issue without action at meeting 062.
The following functions are not defined in the current spec:
fn:unparcel, fn:parcel → dropped
fn:xdm-to-json → #576
fn:concat() → see #701
fn:parts → see #463
codepoints-to-string(), etc. (sequence-valued arguments)