linkeddata / rdflib.js

Linked Data API for JavaScript
http://linkeddata.github.io/rdflib.js/doc/

Serializing to Turtle results in unused prefix and missing @base #251

Open b-zee opened 5 years ago

b-zee commented 5 years ago

When serializing a graph to Turtle, it seems there is an extra unused prefix.

Consider the following snippet:

const rdf = require('rdflib');

const graph = rdf.graph();
const FOAF = rdf.Namespace('http://xmlns.com/foaf/0.1/');

graph.add(rdf.sym('http://me.com/'), FOAF('knows'), rdf.sym('http://him.com'));
graph.add(rdf.sym('http://me.com/'), FOAF('knows'), rdf.sym('http://her.com'));

const serial = rdf.serialize(undefined, graph, 'http://me.com/', 'text/turtle');
console.log(serial);

This produces:

@prefix : </#>.
@prefix me: <>.
@prefix n0: <http://xmlns.com/foaf/0.1/>.

me: n0:knows <http://her.com>, <http://him.com>.

The empty : prefix is not used.

Also, I'm missing the @base or any reference to the main subject (http://me.com/).
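Until the serializer itself changes, the unused `:` prefix noted above could be stripped from the output by post-processing the text. This is only a minimal sketch: `stripUnusedPrefixes` is a hypothetical helper, not part of rdflib.js, and it assumes one `@prefix` declaration per line and no multi-line string literals.

```javascript
// Hypothetical helper, not part of rdflib.js: drop unused @prefix lines.
function stripUnusedPrefixes(turtle) {
  const prefixLine = /^@prefix\s+([^\s:]*):\s*<[^>]*>\s*\.\s*$/;
  const lines = turtle.split('\n');

  // Everything except the declarations, with IRIs and single-line string
  // literals blanked out so that colons inside them are not counted as uses.
  const body = lines
    .filter((line) => !prefixLine.test(line))
    .join('\n')
    .replace(/<[^>]*>/g, ' ')
    .replace(/"(?:[^"\\\n]|\\.)*"/g, ' ');

  const escapeRe = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

  return lines
    .filter((line) => {
      const match = prefixLine.exec(line);
      if (!match) return true; // not a declaration: keep as-is
      // A use looks like "name:" at the start or after whitespace/punctuation.
      const use = new RegExp('(^|[\\s,;\\[\\(^])' + escapeRe(match[1]) + ':');
      return use.test(body);
    })
    .join('\n');
}
```

Calling `stripUnusedPrefixes(serial)` on the output shown above should keep the `me:` and `n0:` declarations and drop the empty one. It does not address the missing `@base`.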

kjetilk commented 5 years ago

Yes, and this highlights an issue that has been on my mind for some time. The Turtle produced by the serializer isn't very readable or intuitive to understand, even though readability is one of Turtle's strengths.

It may seem like a small thing, but I believe it is not. The Turtle must be very readable, so that newcomers can read it intuitively.

We must also make sure the prefixes are good. Over in the Perl world, we have three modules for making sensible qnames: RDF::Prefixes, which tries to guess prefixes by looking at the URI; RDF::NS, which uses http://prefix.cc/, a crowdsourced list of prefix-URI pairs; and RDF::NS::Curated, which is a curated set of such pairs. URI::NamespaceMap is used to guess and manage prefix-URI pairs. Perhaps we should do something similar to ensure that we always have good qnames?
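To make the idea concrete, here is a rough sketch of how such a lookup might work in JavaScript: try a curated table first, then guess from the URI, and only then fall back to generated names like `n0`. Everything here is hypothetical (the `CURATED` table, `guessFromUri`, `assignPrefixes`); none of it is rdflib.js API or a port of the Perl modules.

```javascript
// Hypothetical curated table; a real one might be built from prefix.cc.
const CURATED = {
  'http://xmlns.com/foaf/0.1/': 'foaf',
  'http://www.w3.org/1999/02/22-rdf-syntax-ns#': 'rdf',
  'http://www.w3.org/2000/01/rdf-schema#': 'rdfs'
};

// Segments that make poor prefixes on their own.
const GENERIC = new Set(['http', 'https', 'www', 'ns', 'com', 'org', 'net']);

// Guess a prefix from the last usable segment of the namespace URI.
function guessFromUri(ns) {
  const parts = ns.split(/[\/#:.]+/).filter(Boolean).reverse();
  for (const part of parts) {
    const candidate = part.toLowerCase();
    if (/^[a-z][a-z0-9_-]*$/.test(candidate) && !GENERIC.has(candidate)) {
      return candidate;
    }
  }
  return null;
}

// Assign a unique prefix to each namespace: curated, then guessed, then nN.
function assignPrefixes(namespaces) {
  const byPrefix = new Map();
  let counter = 0;
  for (const ns of namespaces) {
    let candidate = CURATED[ns] || guessFromUri(ns);
    if (!candidate || byPrefix.has(candidate)) {
      do {
        candidate = 'n' + counter++;
      } while (byPrefix.has(candidate));
    }
    byPrefix.set(candidate, ns);
  }
  return byPrefix;
}
```

With this sketch, the FOAF namespace would come out as `foaf:` rather than `n0:`, and the generated names are used only when nothing better can be found.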

Then, I think we should consider serializing with some assumptions that are not actually part of the data model, if possible. For example, we could use qnames for all the "vocabulary terms", but not for the resources that we talk about. For those, we should use @base and pointy brackets. Perhaps there should be a bit more logic in the serializer for this, but I think it would help a lot.
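A minimal sketch of that convention, working on plain `{ s, p, o }` objects of IRI strings rather than on rdflib.js terms: predicates are treated as vocabulary terms and written as qnames, while subjects and objects are written in pointy brackets relative to `@base`. The function names are hypothetical, literals and blank nodes are left out, and relativization is deliberately conservative (anything it is unsure about stays absolute).

```javascript
// Conservative local-name pattern (an assumption: Turtle allows more).
const SIMPLE_LOCAL = /^[A-Za-z_][A-Za-z0-9_-]*$/;

// Write a resource in pointy brackets, relative to the base when safe.
function resource(iri, base) {
  if (iri.startsWith(base)) {
    const rest = iri.slice(base.length);
    const safePath =
      base.endsWith('/') && /^[A-Za-z0-9_-][A-Za-z0-9_\/#-]*$/.test(rest);
    if (rest === '' || rest.startsWith('#') || safePath) return `<${rest}>`;
  }
  return `<${iri}>`;
}

// Write a vocabulary term as a qname if a prefix fits, recording its use.
function vocabTerm(iri, prefixes, used) {
  for (const [name, ns] of Object.entries(prefixes)) {
    if (iri.startsWith(ns) && SIMPLE_LOCAL.test(iri.slice(ns.length))) {
      used.add(name);
      return `${name}:${iri.slice(ns.length)}`;
    }
  }
  return `<${iri}>`;
}

// Emit @base, only the prefixes actually used, then one line per triple.
function toTurtle(triples, base, prefixes) {
  const used = new Set();
  const statements = triples.map(
    ({ s, p, o }) =>
      `${resource(s, base)} ${vocabTerm(p, prefixes, used)} ${resource(o, base)} .`
  );
  const header = [`@base <${base}> .`];
  for (const name of used) {
    header.push(`@prefix ${name}: <${prefixes[name]}> .`);
  }
  return [...header, '', ...statements].join('\n');
}
```

For the two triples in the original report, with base `http://me.com/` and a `foaf` prefix, this would write the subject as `<>` under an explicit `@base` line and declare no prefix that is not used.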

ManuelTS commented 2 years ago

@kjetilk, any update on this? I'm having the exact same issue with the two @prefixes and a missing @base. If you process a file that has a base, the output lacks one; see #245.

kjetilk commented 2 years ago

I'm not aware of anything, I'm afraid.