SynBioDex / SBOL-specification

The Synthetic Biology Open Language (SBOL)
http://sbolstandard.org

Recommending preferred RDF formats #379

Closed jakebeal closed 3 years ago

jakebeal commented 4 years ago

While choosing RDF with SBOL 3 means we don't have to write our own serialization format, there are a bunch of different RDF serialization formats, and not all RDF libraries support all formats. As such, it would likely be good for us to pick one or two that are recommended best practices and make sure that all of the core tooling supports and defaults to those formats.

As a starting point for discussion, Wikipedia lists seven such formats: Turtle, N-Triples, N-Quads, JSON-LD, N3, RDF/XML, and RDF/JSON. Of these, I see arguments in favor of four:

My recommendation would be to go with N-Triples and JSON-LD, but I think there needs to be a broader community discussion.

cjmyers commented 4 years ago

I suggest that we recommend that SBOL libraries support all 4 formats listed above.

jakebeal commented 3 years ago

Supporting all is nice (though it might be problematic in some languages), but what are the priorities and defaults?

cjmyers commented 3 years ago

If I have to pick just two, it would be Turtle for readability and RDF/XML for exchange. I think all four have some uses though.

jamesamcl commented 3 years ago

We intentionally avoided choosing a serialization in the spec to avoid this debate. I don't think supporting all will be problematic in any of the major languages which have mature RDF libraries. Maybe an issue for more esoteric stacks, but not sure it's worth spending any significant time on. I would rather focus on SBOL as an abstract data model/ontology and leave the issue of serialization to the tooling.

jakebeal commented 3 years ago

Unfortunately, this issue isn't just restricted to esoteric languages. This issue came up already for us on pySBOL3, since it turns out that Python rdflib doesn't support either of the JSON formats.

jamesamcl commented 3 years ago

Seems like there are libraries available?

https://github.com/RDFLib/rdflib-jsonld

https://github.com/RDFLib/rdflib-rdfjson

jakebeal commented 3 years ago

Right, but what's the priority on supporting these formats vs. others? Do we roll this into the library, or just leave it as an exercise for the reader? What are the implementers of other libraries planning? Choosing RDF should simplify all of these conversations, and we don't necessarily have to commit in the spec, but it would be a good idea to at least discuss this to see what people's needs, plans, and constraints are.

jamesamcl commented 3 years ago

I feel that this is a conversation that needs to happen between the library implementors, rather than necessarily an issue with the specification. Maybe a topic for the next implementation working group meeting?

cjmyers commented 3 years ago

Which is this Wednesday.

goksel commented 3 years ago

My plan for the Java version is to support these five formats: Turtle, N-Triples, JSON-LD, RDF/XML, RDF/JSON. It may be useful to include best practices or some guidance about why and how each of these serialisations would be useful. For N-Triples, I would add "The ability to split up large SBOL data into smaller chunks and merge them later more easily".
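
To illustrate the N-Triples point: because N-Triples puts exactly one complete triple on each line, splitting and merging can be done with plain line operations, without an RDF parser. The following is only a minimal sketch using the Python standard library. The function names `split_ntriples` and `merge_ntriples` and the `example.org` IRIs are made up for illustration and are not part of any SBOL library.

```python
from typing import Iterable, List

# Hypothetical sample data; the example.org IRIs are placeholders, not SBOL terms.
EXAMPLE = """\
<https://example.org/part1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <https://example.org/Part> .
<https://example.org/part1> <https://example.org/name> "promoter" .
<https://example.org/part2> <https://example.org/name> "terminator" .
"""


def _statements(text: str) -> List[str]:
    """Return the triple lines of an N-Triples document, skipping blanks and comments."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def split_ntriples(text: str, chunk_size: int) -> List[str]:
    """Split a document into chunks holding at most chunk_size triples each."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    statements = _statements(text)
    return [
        "\n".join(statements[i:i + chunk_size]) + "\n"
        for i in range(0, len(statements), chunk_size)
    ]


def merge_ntriples(chunks: Iterable[str]) -> str:
    """Concatenate chunks, dropping textually identical triples, keeping first-seen order."""
    seen = {}  # dict preserves insertion order, so it doubles as an ordered set
    for chunk in chunks:
        for statement in _statements(chunk):
            seen.setdefault(statement, None)
    return "".join(statement + "\n" for statement in seen)
```

Two caveats with this sketch: duplicates are only detected when the lines are textually identical, and blank node labels are scoped to a single document, so chunks that come from different sources and contain blank nodes would need relabelling before a merge like this is safe.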

jakebeal commented 3 years ago

The common ground consensus appears to be: Turtle, N-Triples, JSON-LD, RDF/XML

And we will want to add this to the serialization section of the spec.
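
As a rough illustration of what tooling might do with this list, the sketch below records the four consensus formats with their commonly used file extensions and registered media types, and guesses a format from a file name. The names `RdfFormat`, `CONSENSUS_FORMATS`, and `guess_format` are hypothetical and do not come from any SBOL library. The thread itself does not settle which format a library should default to, so no default is encoded here.

```python
from pathlib import Path
from typing import NamedTuple, Optional


class RdfFormat(NamedTuple):
    name: str
    extension: str
    media_type: str


# The four formats named as common ground in the thread.
CONSENSUS_FORMATS = (
    RdfFormat("Turtle", ".ttl", "text/turtle"),
    RdfFormat("N-Triples", ".nt", "application/n-triples"),
    RdfFormat("JSON-LD", ".jsonld", "application/ld+json"),
    RdfFormat("RDF/XML", ".rdf", "application/rdf+xml"),
)


def guess_format(filename: str) -> Optional[RdfFormat]:
    """Return the consensus format matching the file extension, or None if unknown."""
    suffix = Path(filename).suffix.lower()
    for fmt in CONSENSUS_FORMATS:
        if fmt.extension == suffix:
            return fmt
    return None
```

For example, `guess_format("design.ttl")` returns the Turtle entry, while an ambiguous extension such as `.xml` returns `None` so the caller has to choose a format explicitly.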