rdf-ext-archive / discussions

This repo is for discussions all over the rdf-ext project

Dataset distinct from Graph and similar to Store? #21

Open elf-pavlik opened 8 years ago

elf-pavlik commented 8 years ago

Can rdf-ext make distinction between a Dataset and a Graph? https://github.com/rdf-ext/rdf-ext-spec/blob/gh-pages/API.md#graphoptional-arraygraph-other

Graph could represent a simple named or default graph and only contain Triples. Graph could provide simple access to its name. Dataset could represent multiple graphs and contain Quads.

Currently Store seems very close to Dataset https://github.com/rdf-ext/rdf-ext-spec/blob/gh-pages/API.md#store

But parsers like TriG, JSON-LD, N-Quads shouldn't return a Store IMO

bergos commented 8 years ago

There is no distinction in rdf-ext; the Graph can be used for both cases. It supports Triples and Quads. If you mix them, Triples are treated like Quads in the default graph.
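A minimal sketch of that mixing rule, assuming terms are plain objects with termType and value. The names DEFAULT_GRAPH and toQuad are hypothetical and are not part of the rdf-ext API:

```javascript
// Hypothetical sketch, not the rdf-ext API: terms are plain objects.
const DEFAULT_GRAPH = { termType: 'DefaultGraph', value: '' }

// Return a quad; a triple (no .graph) is placed in the default graph.
function toQuad (statement) {
  return {
    subject: statement.subject,
    predicate: statement.predicate,
    object: statement.object,
    graph: statement.graph || DEFAULT_GRAPH
  }
}

const triple = {
  subject: { termType: 'NamedNode', value: 'http://example.org/alice' },
  predicate: { termType: 'NamedNode', value: 'http://xmlns.com/foaf/0.1/name' },
  object: { termType: 'Literal', value: 'Alice' }
}
const quad = Object.assign({}, triple, {
  graph: { termType: 'NamedNode', value: 'http://example.org/graph1' }
})

// A mixed collection ends up as quads only.
const mixed = [triple, quad].map(toQuad)
console.log(mixed.map(q => q.graph.termType))
```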

The difference between a Store and Graph is more the sync vs. async interface.

Graphs in version 1.x will also have the Store interface, but the Store interface will only use Streams.

elf-pavlik commented 8 years ago

I see you have

Do you see a problem with having a distinct Graph, which represents a single graph and contains triples, and a Dataset, which represents multiple graphs and contains quads? This way Graph could have .name to easily access its name, and maybe Dataset could have .graphs to list all the graphs it contains.

Also, a Dataset would allow one to .add a Triple to a particular graph and would take care of converting it to a Quad.
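To make the proposal concrete, here is a rough sketch of what such a Dataset might look like. SketchDataset, its methods, and the use of plain strings as terms are all assumptions for illustration. None of this exists in rdf-ext:

```javascript
// Hypothetical SketchDataset: illustrates the proposal, not an rdf-ext class.
// Assumption: terms are plain strings and '' stands for the default graph.
const DEFAULT_GRAPH_NAME = ''

class SketchDataset {
  constructor () {
    this.quads = []
  }

  // Add a triple to a particular graph; the dataset builds the quad.
  add (triple, graphName = DEFAULT_GRAPH_NAME) {
    this.quads.push({ ...triple, graph: graphName })
    return this
  }

  // List the distinct graph names present in the dataset.
  get graphs () {
    return [...new Set(this.quads.map(q => q.graph))]
  }

  // View of a single graph: its name plus its triples (graph field dropped).
  graph (name) {
    const triples = this.quads
      .filter(q => q.graph === name)
      .map(({ subject, predicate, object }) => ({ subject, predicate, object }))
    return { name, triples }
  }
}

const dataset = new SketchDataset()
dataset.add({ subject: 'ex:alice', predicate: 'foaf:knows', object: 'ex:bob' }, 'ex:g1')
dataset.add({ subject: 'ex:bob', predicate: 'foaf:name', object: '"Bob"' })

console.log(dataset.graphs)
console.log(dataset.graph('ex:g1'))
```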

bergos commented 8 years ago

You can attach the .name property to a Graph object if you want, but I don't want to make it part of the API. One idea I already had in mind is a NodeSet or TermSet class. Then it would be possible to get a unique set of subjects, predicates, objects or graphs with the method of the same name (e.g. .subjects()). So your .name property would be equal to .graphs().shift().
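A possible shape for the TermSet idea, as a sketch only. The class below, the uniqueTerms helper, and the key built from termType and value are assumptions. A real implementation would also have to account for literal language tags and datatypes:

```javascript
// Hypothetical TermSet: keeps one term per (termType, value) pair.
class TermSet {
  constructor (terms = []) {
    this.index = new Map()
    terms.forEach(term => this.add(term))
  }

  add (term) {
    const key = term.termType + '|' + term.value
    if (!this.index.has(key)) this.index.set(key, term)
    return this
  }

  toArray () {
    return [...this.index.values()]
  }
}

// Hypothetical helper standing in for graph.subjects(), graph.graphs(), etc.
function uniqueTerms (quads, position) {
  return new TermSet(quads.map(q => q[position])).toArray()
}

const node = value => ({ termType: 'NamedNode', value })
const quads = [
  { subject: node('http://example.org/a'), predicate: node('http://example.org/p'), object: node('http://example.org/b'), graph: node('http://example.org/g1') },
  { subject: node('http://example.org/a'), predicate: node('http://example.org/p'), object: node('http://example.org/c'), graph: node('http://example.org/g1') }
]

const subjects = uniqueTerms(quads, 'subject')
const objects = uniqueTerms(quads, 'object')
const graphs = uniqueTerms(quads, 'graph')

// With a single graph in the collection, this plays the role of .name.
const name = graphs.shift()
console.log(subjects.length, objects.length, name.value)
```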

I expect the new version of the store interface will no longer have the namedgraph parameter. Everything should be handled in the same way. Adding a graph to a store would be just a .pipe, and the .graph property of the quads will be used. There will also be a method like .clone([subject], [predicate], [object], [graph]), or even a shortcut just for the graph, to force the named graph. The code would look like this: graph.clone(null, null, null, namedGraph).pipe(store).
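The clone idea can be sketched roughly as follows. This is an assumption about the behaviour, not the planned implementation: clone and SketchStore are hypothetical, and a plain forEach stands in for the stream .pipe to keep the example synchronous:

```javascript
// Hypothetical clone(): copies quads, replacing any position given a non-null term.
function clone (quads, subject = null, predicate = null, object = null, graph = null) {
  return quads.map(q => ({
    subject: subject || q.subject,
    predicate: predicate || q.predicate,
    object: object || q.object,
    graph: graph || q.graph
  }))
}

// Hypothetical in-memory store: groups quads by the value of their .graph term.
class SketchStore {
  constructor () {
    this.byGraph = new Map()
  }

  write (quad) {
    const key = quad.graph.value
    if (!this.byGraph.has(key)) this.byGraph.set(key, [])
    this.byGraph.get(key).push(quad)
  }
}

const term = (termType, value) => ({ termType, value })
const namedGraph = term('NamedNode', 'http://example.org/g2')
const source = [
  { subject: term('NamedNode', 'http://example.org/a'), predicate: term('NamedNode', 'http://example.org/p'), object: term('Literal', 'x'), graph: term('DefaultGraph', '') },
  { subject: term('NamedNode', 'http://example.org/b'), predicate: term('NamedNode', 'http://example.org/p'), object: term('Literal', 'y'), graph: term('NamedNode', 'http://example.org/g1') }
]

// Force every quad into namedGraph, then hand the copies to the store.
const store = new SketchStore()
clone(source, null, null, null, namedGraph).forEach(q => store.write(q))
console.log([...store.byGraph.keys()])
```

The store never receives a separate graph argument here: it only looks at the .graph property of each quad, which is the point of dropping the namedgraph parameter.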

elf-pavlik commented 8 years ago

Does it mean that to know if an instance of Graph represents a single named graph, a single default graph or a dataset (multiple graphs), one needs to write code which inspects all the triples/quads?

bergos commented 8 years ago

At the moment, yes. It's treated like any other part of the triple (SPO). But with the proposed new methods (.graphs()) this is done internally and can be implemented very efficiently.

For example, to test whether all quads are in the default graph:

var namedGraphs = graph.graphs()

if (namedGraphs.length === 1 && namedGraphs[0].equals(DefaultGraph)) {
  // every quad in this graph belongs to the default graph
}
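The same check can be extended to answer the earlier question (single named graph, default graph, or dataset). This is a self-contained sketch with a hypothetical classify helper that inspects the quads directly, since .graphs() is only a proposal at this point:

```javascript
// Hypothetical helper: classify a collection of quads by its distinct graph terms.
function classify (quads) {
  const seen = new Map()
  for (const q of quads) {
    seen.set(q.graph.termType + '|' + q.graph.value, q.graph)
  }
  const graphs = [...seen.values()]

  if (graphs.length === 0) return 'empty'
  if (graphs.length > 1) return 'dataset'
  return graphs[0].termType === 'DefaultGraph' ? 'default graph' : 'named graph'
}

const defaultGraph = { termType: 'DefaultGraph', value: '' }
const graphOne = { termType: 'NamedNode', value: 'http://example.org/g1' }

// Only the graph position matters for this check, so S/P/O are plain strings here.
const inDefault = [{ subject: 'a', predicate: 'p', object: 'b', graph: defaultGraph }]
const inNamed = [{ subject: 'a', predicate: 'p', object: 'b', graph: graphOne }]
const inBoth = inDefault.concat(inNamed)

console.log(classify(inDefault), classify(inNamed), classify(inBoth))
```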

elf-pavlik commented 8 years ago

@RubenVerborgh do you have any JS code which implements http://ruben.verborgh.org/blog/2015/10/06/turtles-all-the-way-down/ ? I wonder if you have an opinion on this issue based on implementation experience. I'll also need to do some prototyping myself first. Possibly I'm making a wrong assumption that things will be more straightforward with a distinction between a graph (named or default), which contains triples and gives easy access to its name, and a dataset, which contains quads and also allows accessing single graphs.

RubenVerborgh commented 8 years ago

@elf-pavlik Yes, all of our Linked Data Fragments client variants.

All still need to be merged into a single client, but the principle works.

bergos commented 8 years ago

@RubenVerborgh LDF started with triples and was extended to use quads later, right? From your experience, do you think there should be different classes to handle collections of triples and quads? I think that's the actual question of @elf-pavlik.

I think it can be done with a single class. The Graph class needs some new methods for easier handling of quads, and using a class named Graph to store different named graphs could be confusing. Defining Dataset as an alias could be the solution.

RubenVerborgh commented 8 years ago

@elf-pavlik No different classes needed to handle triples and quads. The LDF client needs a small part of separate logic (to extract metadata), but that's it.

elf-pavlik commented 8 years ago

Thanks @RubenVerborgh I'll take a look at LDF client.js code!

@bergos I'll do some prototyping and will base my further responses on that. I find it useful to have .graph() on an instance of rdf-ext Store, while it seems to me to be missing on an instance of the Dataset (rdf-ext Graph). It would also seem confusing to me if we were to have .graph() on an instance of the Graph... as I said, I need to write some more code first to get a better feel for it.

elf-pavlik commented 8 years ago

NOTE: possibly terminology worth reusing: https://www.w3.org/TR/sparql11-service-description/

sd:Dataset, sd:Graph, sd:name, sd:graph, etc.

bergos commented 8 years ago

In version 1.x only the term Dataset will be used. The Dataset class is compatible with the Graph class. The Dataset will get a Store interface (maybe .store() will return an object with that interface).

elf-pavlik commented 8 years ago

Which Store interface? The Streams Interface as defined in the RDFJS TF, or the rdf-ext Store Interface? If the second, what will Promise graph (RDFNode|String namedGraph, GraphCallback callback) resolve with?