linkeddata / rdflib.js

Linked Data API for JavaScript
http://linkeddata.github.io/rdflib.js/doc/

Updater does not serialize carriage return line breaks (\r\n) correctly #517

Closed by angelo-v 2 years ago

angelo-v commented 2 years ago

If a literal contains \r\n, the carriage return is written as a raw line break into the resulting SPARQL Update query, which makes the query syntactically invalid. A unit test demonstrates this: https://github.com/linkeddata/rdflib.js/pull/516/commits/40a50b89725652f3f781201ecdbdc7b5be81b8b3#diff-15043c9b2c959a81026ffd28eae448814aa6d5b02c2a6e3f23e09488ad6b11b5R54

Resulting query:

INSERT DATA { <https://pod.example/test/foo#subject> <https://pod.example/test/foo#predicate> "literal
\nvalue" .
 }

angelo-v commented 2 years ago

The problem seems to originate from Literal.toNT, which escapes \n but not \r.

timbl commented 2 years ago

What else should be included? \t? What defines that list? Maybe the N3 string serialization, which IIRC originally copied or pointed to Python.

angelo-v commented 2 years ago

Indeed there may be more to consider

[9] STRING_LITERAL_QUOTE ::= '"' ([^#x22#x5C#xA#xD] | ECHAR | UCHAR)* '"'

see https://www.w3.org/TR/n-triples/#grammar-production-STRING_LITERAL_QUOTE and https://stackoverflow.com/a/40834612/758813
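
To make the production above concrete, here is a rough sketch that transcribes it into a JavaScript regular expression. The ECHAR and UCHAR alternatives follow the N-Triples grammar; the names STRING_LITERAL_QUOTE_RE and isValidStringLiteralQuote are made up for this illustration and are not part of rdflib.js.

```javascript
// Sketch of N-Triples production [9] STRING_LITERAL_QUOTE as a regex.
//   ECHAR ::= '\' [tbnrf"'\]
//   UCHAR ::= '\u' HEX HEX HEX HEX | '\U' HEX x 8
// Hypothetical names; this is not code from rdflib.js.
const STRING_LITERAL_QUOTE_RE =
  /^"(?:[^\x22\x5C\x0A\x0D]|\\[tbnrf"'\\]|\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8})*"$/;

// Returns true if `token` (including its surrounding quotes) matches the production.
function isValidStringLiteralQuote(token) {
  return STRING_LITERAL_QUOTE_RE.test(token);
}

// Escaped CR LF (backslash sequences in the token): accepted.
console.log(isValidStringLiteralQuote('"literal\\r\\nvalue"')); // true
// Raw CR LF characters inside the quotes: rejected.
console.log(isValidStringLiteralQuote('"literal\r\nvalue"')); // false
```

A check like this could be used in a test to assert that whatever the serializer emits is at least syntactically a valid quoted literal.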

I will write some test cases

angelo-v commented 2 years ago

so according to the spec (link in my last comment), #x22 ("), #x5C (\), #xA (\n) and #xD (\r) are disallowed in their raw form and need to be replaced by escape sequences

of those, only the carriage return was missing, which I have now added in https://github.com/linkeddata/rdflib.js/pull/516
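
For illustration, a minimal sketch of escaping all four disallowed characters in one pass. This is not the actual Literal.toNT code from rdflib.js or the change in the linked pull request; escapeLiteralValue, toQuotedLiteral and ESCAPES are hypothetical names chosen for this example.

```javascript
// Hypothetical helper, NOT the real Literal.toNT implementation.
// Maps each character that may not appear raw in a quoted literal
// to its escape sequence.
const ESCAPES = {
  '\\': '\\\\', // #x5C backslash
  '"': '\\"',   // #x22 double quote
  '\n': '\\n',  // #xA  line feed
  '\r': '\\r',  // #xD  carriage return
};

// Replace every disallowed character in a single pass, so the backslashes
// introduced by one replacement are never escaped a second time.
function escapeLiteralValue(value) {
  return value.replace(/[\\"\n\r]/g, (ch) => ESCAPES[ch]);
}

// Wrap the escaped value in double quotes.
function toQuotedLiteral(value) {
  return '"' + escapeLiteralValue(value) + '"';
}

// The literal from the failing test now stays on a single line.
console.log(toQuotedLiteral('literal\r\nvalue')); // "literal\r\nvalue"
```

Doing the replacement with one regular expression rather than a chain of replace calls sidesteps ordering problems: with chained calls, the backslash has to be handled first or the escapes added for the other characters get escaped again.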