rdfjs / N3.js

Lightning fast, spec-compatible, streaming RDF for JavaScript
http://rdf.js.org/N3.js/

Any way to recover from unescaped double quotes? #350

Closed excursus closed 1 year ago

excursus commented 1 year ago

The 2022-03 snapshot of DBpedia contains IRIs with unescaped double quotes, such as this one:

<http://dbpedia.org/resource/"populate_or_perish"> <http://www.w3.org/2002/07/owl#sameAs> <http://rdf.freebase.com/ns/m.01182wk6> .

See here. This causes the lexer to throw the following error:

    const err = new Error(`Unexpected "${issue}" on line ${this._line}.`);
                ^
Error: Unexpected "<http://dbpedia.org/resource/"populate_or_perish">" on line 820.

Is there any way to gracefully recover from errors such as these?

jeswr commented 1 year ago

No, there is not. N3.js is designed to throw errors on invalid syntax, and these IRIs are invalid (note that the IRIREF production in the grammar explicitly forbids double quotes inside an IRI).

I suggest you raise an issue with the DBpedia team to request that they remove such invalid IRIs from their dataset.
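
Until the data is fixed upstream, one possible workaround is to clean the input before it reaches the parser. The sketch below is not part of N3.js and nobody in this thread proposed it. It assumes line-oriented N-Triples input and percent-encodes the characters that IRIREF forbids inside `<...>` tokens, leaving string literals untouched. The names `sanitizeNTriplesLine`, `percentEncode`, and `FORBIDDEN_IN_IRI` are hypothetical helpers made up for illustration.

```javascript
// Characters the IRIREF production forbids between "<" and ">":
// U+0000-U+0020 and " { } | ^ ` plus any backslash that does not start a
// \uXXXX or \UXXXXXXXX escape. "<" and ">" are not handled here because
// they delimit the token itself.
const FORBIDDEN_IN_IRI =
  /[\u0000-\u0020"{}|^`]|\\(?!u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})/g;

// Percent-encode one forbidden (ASCII) character, e.g. '"' becomes '%22'.
function percentEncode(ch) {
  return '%' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(2, '0');
}

// Hypothetical helper (not an N3.js API): rewrite one N-Triples line so that
// every <...> token outside a string literal satisfies IRIREF.
function sanitizeNTriplesLine(line) {
  let out = '';
  let i = 0;
  while (i < line.length) {
    const ch = line[i];
    if (ch === '"') {
      // String literal: copy verbatim up to the closing unescaped quote.
      let j = i + 1;
      while (j < line.length && line[j] !== '"') {
        j += line[j] === '\\' ? 2 : 1; // skip escaped characters such as \"
      }
      out += line.slice(i, j + 1);
      i = j + 1;
    } else if (ch === '<') {
      // IRI: everything up to the next ">" is treated as the IRI body.
      const end = line.indexOf('>', i + 1);
      if (end === -1) {
        out += line.slice(i); // unterminated IRI: leave the rest untouched
        break;
      }
      const body = line.slice(i + 1, end);
      out += '<' + body.replace(FORBIDDEN_IN_IRI, percentEncode) + '>';
      i = end + 1;
    } else {
      out += ch; // whitespace, blank nodes, language tags, "^^", "."
      i += 1;
    }
  }
  return out;
}

// The line from the report above, with the quotes rewritten as %22.
const fixed = sanitizeNTriplesLine(
  '<http://dbpedia.org/resource/"populate_or_perish"> ' +
  '<http://www.w3.org/2002/07/owl#sameAs> ' +
  '<http://rdf.freebase.com/ns/m.01182wk6> .'
);
console.log(fixed);
```

The cleaned lines could then be passed to the parser as usual. Note that this changes the data: the percent-encoded IRI is a different IRI from the one in the dump, so whether this is acceptable depends on how the triples are used downstream. An IRI that contains a literal `>` or `<` cannot be repaired this way.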

TallTed commented 1 year ago

@excursus — I believe the DBpedia Information Extraction Framework is the best repository for issues such as this. If not, they should be able to point you in a better direction.