eyereasoner / eye

Euler Yet another proof Engine
https://eyereasoner.github.io/eye/
MIT License

Issue with string escaping #66

Closed: ktk closed this issue 1 year ago

ktk commented 1 year ago

Given the following input file:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://data.table.org/> .
@prefix table: <http://schema.table.org/> .

<http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO> a table:Table ;
    table:contextLabel "BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO" ;
    schema:name "\"DA_ALLGewaesser_BIO\"" ;
    table:column <http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO/column/gew_ID>, <http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO/column/gew_fk_ID>, <http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO/column/gew_Name>, <http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO/column/gew_Code>, <http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO/column/gew_Typ>, <http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO/column/gew_Childs> ;
    table:hasUniqueContraint <http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO/constraint/SYS_PK_11701> .

and the following N3 file:

@prefix schema: <http://schema.org/>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix string: <http://www.w3.org/2000/10/swap/string#>.
@prefix table: <http://schema.table.org/> .

#
# Access: remove " and add it as rdfs:label 
#
{
  ?iri a table:Table ;
    schema:name ?label .
#  (?label "\"" "") string:replaceAll ?cleanedLabel
}
=>
{
  ?iri rdfs:label ?label .
  #?iri rdfs:label ?cleanedLabel .
} .

This generates invalid Turtle:

@prefix schema: <http://schema.org/>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix string: <http://www.w3.org/2000/10/swap/string#>.
@prefix table: <http://schema.table.org/>.

<http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO> rdfs:label ""DA_ALLGewaesser_BIO"".

It looks like the escaping of " gets lost.

Is this a bug, or am I missing something?

ktk commented 1 year ago

As a follow-up question: assuming the handling of " worked properly, would the commented-out replaceAll call be correct as written? It doesn't do anything in my tests.

phochste commented 1 year ago

With EYE v2.7.3 josd, the example above produces valid Turtle:

<http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO> rdfs:label "\"DA_ALLGewaesser_BIO\"".

(using eye --nope --quiet --pass-only-new test.n3)

The string:replaceAll built-in requires lists as input for the search and replacement arguments, as in:

{
  ?iri a table:Table ;
    schema:name ?label .
  (?label ("\"") ("")) string:replaceAll ?cleanedLabel
}
=>
{
  ?iri rdfs:label ?label .
  ?iri rdfs:label ?cleanedLabel .
} .

which produces:

<http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO> rdfs:label "\"DA_ALLGewaesser_BIO\"".
<http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO> rdfs:label "DA_ALLGewaesser_BIO".
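
The argument shape used above (a subject string, a list of search strings, and a list of replacements) can be modelled in a few lines of Python. This is only a sketch of the idea, not EYE's implementation: the function name replace_all is hypothetical, and it assumes plain substring replacement applied pair by pair in list order, which the thread does not confirm.

```python
def replace_all(text, searches, replacements):
    """Apply each (search, replacement) pair to text, in list order.

    Assumption: literal substring replacement, one pair after another.
    This models the (string, search list, replacement list) shape only.
    """
    if len(searches) != len(replacements):
        raise ValueError("search and replacement lists must have the same length")
    for search, replacement in zip(searches, replacements):
        text = text.replace(search, replacement)
    return text


# The schema:name value from the data file, including its literal quotes.
label = '"DA_ALLGewaesser_BIO"'
cleaned = replace_all(label, ['"'], [''])
print(cleaned)
```

Passing bare strings instead of lists, as in the commented-out line of the original rule, does not match this shape, which is consistent with the rule doing nothing in that form.
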

ktk commented 1 year ago

Interesting, I'm on EYE v2.3.0 josd. Will upgrade and report back.

Thanks for the hints regarding replaceAll. I'm looking forward to the new documentation you mentioned in the discussions!

ktk commented 1 year ago

I am now on EYE v2.7.4 josd and run:

eye --nope --quiet --pass-only-new test.n3 --turtle test.ttl > sugus.ttl

And the string is still:

<http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO> rdfs:label ""DA_ALLGewaesser_BIO"".

I'm not sure what I'm doing differently. I'm on macOS, by the way.

ktk commented 1 year ago

I also tried your rule and got an empty result. Once I comment out the ?cleanedLabel lines, I get the same result as above (wrong escaping).

phochste commented 1 year ago

Ah, I put everything in one file, test.n3:

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix schema: <http://schema.org/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex: <http://data.table.org/> .
@prefix table: <http://schema.table.org/> .
@prefix string: <http://www.w3.org/2000/10/swap/string#>.

<http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO> a table:Table ;
    table:contextLabel "BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO" ;
    schema:name "\"DA_ALLGewaesser_BIO\"" ;
    table:column <http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO/column/gew_ID>, <http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO/column/gew_fk_ID>, <http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO/column/gew_Name>, <http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO/column/gew_Code>, <http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO/column/gew_Typ>, <http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO/column/gew_Childs> ;
    table:hasUniqueContraint <http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO/constraint/SYS_PK_11701> .

#
# Access: remove " and add it as rdfs:label 
#
{
  ?iri a table:Table ;
    schema:name ?label .
  (?label ("\"") ("")) string:replaceAll ?cleanedLabel
}
=>
{
  ?iri rdfs:label ?label .
  ?iri rdfs:label ?cleanedLabel .
} .

And ran:

% eye --nope --pass-only-new --quiet test.n3
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.
@prefix schema: <http://schema.org/>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix ex: <http://data.table.org/>.
@prefix table: <http://schema.table.org/>.
@prefix string: <http://www.w3.org/2000/10/swap/string#>.

<http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO> rdfs:label "\"DA_ALLGewaesser_BIO\"".
<http://data.table.org/BIO-DB-DATA-metadata/DA_ALLGewaesser_BIO> rdfs:label "DA_ALLGewaesser_BIO".

I'm also on a Mac. But I can confirm that loading the data file separately with --turtle gives your parsing error.

ktk commented 1 year ago

Good catch. I put it all in one file, and that way it works for me too. That's not a solution for my use case, but at least we can both reproduce it now.

josd commented 1 year ago

Thanks for the observation, @ktk, and sorry that it took so long. There was indeed an issue with the escaping of " in literals when using --turtle. It should now be fixed in EYE v3.3.4.
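
For readers unfamiliar with the rule that was being violated: when a string is written out as a short double-quoted Turtle literal, any " or backslash inside it must be escaped, as must line breaks. The Python sketch below illustrates that serialization step only. It is not the fix that went into EYE, and the name turtle_literal is hypothetical.

```python
# Characters escaped when writing a short double-quoted Turtle literal.
_ESCAPES = {
    "\\": "\\\\",
    '"': '\\"',
    "\n": "\\n",
    "\r": "\\r",
    "\t": "\\t",
}


def turtle_literal(value):
    """Return value as a double-quoted Turtle string literal (illustrative only)."""
    return '"' + "".join(_ESCAPES.get(ch, ch) for ch in value) + '"'


# The in-memory value of schema:name still contains its two quote characters.
name = '"DA_ALLGewaesser_BIO"'
print(turtle_literal(name))
```

Skipping this step and wrapping the raw value in quotes yields the invalid ""DA_ALLGewaesser_BIO"" form reported at the top of this issue.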

josd commented 1 year ago

In short: EYE has an N3 parser, which can of course also parse Turtle. It is based on a DCG (Definite Clause Grammar) and as such is a factor of 5 to 6 slower than the parser invoked by --turtle, which is native C but handles only Turtle files.

ktk commented 1 year ago

Ah, I was curious about the difference. Thanks for the fix! BTW, EYE made me look up all the stuff I once learned about Prolog. Really nice work :)