Open nusakan opened 2 weeks ago
Thank you very much for your fix which is more or less done in the latest eye and lingua.
Thank you for your fast commit.
It seems however that there are still few failures. I've written a small test suite in Notation3 (extended to other string:
rules), covering a larger scope (like false positives/negatives or errors).
I've uploaded here: string_test.n3.txt
(renamed to .txt
as github does not recognize .n3
as text files)
You may run it with:
eye --quiet --nope string_test.n3
This test suite may not be quiet exact (I've not found an explicit documentation for the all the
string:
rules), you may review it before considering the test should pass or fail.
Here are the failing properties as exposed by the test suite alongside the involved tests (see the test suite for reference) (some core builtin rules are still there) with EYE v10.19.8 (2024-08-30)
(master branch):
@prefix string: <http://www.w3.org/2000/10/swap/string#>.
@prefix : <http://example.com/string_test.n3#>.
string:greaterThan :fail (:t090 :t091 :t092 :t093 :t094 :t095).
string:lessThan :fail (:t110 :t111 :t112 :t113 :t114 :t115).
string:matches :fail (:t140).
string:notContainsRoughly :fail (:t150).
string:notGreaterThan :fail (:t173 :t174 :t175 :t170 :t171 :t172).
string:notLessThan :fail (:t183 :t184 :t185 :t180 :t181 :t182).
string:notMatches :fail (:t190).
string:replaceAll :fail (:t210 :t211 :t215).
string:scrapeAll :fail (:t231 :t232).
string:search :fail (:t244).
string:substring :fail (:t261 :t262).
string:upperCase :fail (:t270).
string:contains :fail (:t020).
string:containsIgnoringCase :fail (:t030).
string:containsRoughly :fail (:t040).
string:endsWith :fail (:t050).
string:format :error (:t081).
Best regards
Thank you for your fast test development👍 Your cases should also perfectly fit in https://github.com/eyereasoner/eye/blob/master/reasoning/bi/biP.n3#L111 Anyway, I first try to fix the issues and some 30 of them are now fixed in https://github.com/eyereasoner/eye/releases/tag/v10.19.9 and https://github.com/eyereasoner/lingua/releases/tag/v1.4.1
The remaining ones are now
string:matches :fail (:t140).
string:notMatches :fail (:t190).
string:replaceAll :fail (:t210 :t215).
string:scrapeAll :fail (:t231 :t232).
string:search :fail (:t244).
string:substring :fail (:t262).
string:format :error (:t081).
string:replaceAll :error (:t211 :t212).
We are now at
string:matches :fail (:t140).
string:notMatches :fail (:t190).
string:replaceAll :fail (:t210 :t215).
string:scrapeAll :fail (:t231 :t232).
string:search :fail (:t244).
It appears that raised errors are independent of tests success or failure.
Errors such as:
** ERROR ** eam ** error(syntax_error(missing closing parenthesis),_624)
[...]
.. are raised with the following passing tests:
:t203 :assertNone { ("a\tb" "\\" "-") string:replace "a-tb" } .
:t213 :assertNone { ("a\tb" ("\\") ("-")) string:replaceAll "a-tb" } .
:t223 :assertNone { ("a\tb" "(\\)" ) string:scrape ?t223v } .
:t234 :assertNone { ("a\tb\nc" "(\\)") string:scrapeAll ("t") } .
While the following failing tests do not raise any error:
:t215 :assertNone { ("\tb" ("\t" "b") ("x" "y")) string:replaceAll "xy" } .
:t232 :assertNone { ("a\tb\nc" "(\t|\n)") string:scrapeAll ("\t" "\n") } .
:t244 :assertNone { ("a\\\t" "(\\+)") string:search ?t244v } .
I see no problem for regular expressions expressed with wrong syntax to have errors turned into warning messages, provided it follows the SNAF rule, as it provides substantial information (even if hard to debug). However for consistency reasons, the same behavior should be adopted in any case.
I guess those exceptions might have to do with https://github.com/SWI-Prolog/packages-pcre/issues/24 so will stay tuned. After a few tweaks we are now at
string:replaceAll :fail (:t211 :t215).
string:scrapeAll :fail (:t231 :t232).
string:search :fail (:t244).
I may be wrong but some
eye
reasoner behaviors may be incorrect.Escaped characters issue
Some builtin
string: <http://www.w3.org/2000/10/swap/string#>
operations have unexpected behaviors when used with strings containing backslashes ("\\"
,"\t"
...) characters, especially when regular expressions are involved.For example:
"\n" string:length ?v
?v
is1
?v
is2
("a\\b" "(\\\\)" "-") string:replace ?v
?v
is"a-b"
?v
is"a--b"
("a\\b" "(.*)") string:scrape "a\\b"
A quick fix may be implemented for specific builtin core string functions.
However, it seems that the internal representation of literals relies on the N3 strings serialization syntax:\
"\t"
is represented as\
andt
which produces these weird behaviors (even "segfault"s in some more complex cases).So the quick fix may not be way to go: it introduces supplementary computational efforts, and only addresses partially the issue.
Thanks for your attention (and all the work you have already done).
Quick & dirty fix (
v10.19.7
- for illustration purpose only).string:scrape
,string:replace
,string:length
)escape_atoms
rule ensure the conversionSample test cases