rdfjs / dataset-spec

RDF/JS: Dataset specification 1.0 – This specification provides a definition how to store multiple quads in a so-called dataset.
https://rdf.js.org/dataset-spec/

High level RDF Store api #42

Open dmitrizagidulin opened 7 years ago

dmitrizagidulin commented 7 years ago

I've been working on an implementation of an RDF store LDP backend for Solid (something like rdf-store-ldp). rdf-store-ldp follows the older Store API (specifically, the https://github.com/rdf-ext/rdf-store-abstract API, which had the add/remove etc. methods).

I see that the current low-level spec Store API has been simplified, and only has the following methods:

So, a couple of questions:

  1. What's the relationship between the low-level Store API and something like rdf-store-ldp (which would serve as an interface to an actual RDF quad store, etc.)?
  2. What, if any, is the relationship between a potential high level Store API and the High Level Dataset API?
  3. Specifically, re Store API, what about add(), merge() and graph() (which the high-level Store API had, but the low-level one does not)?

Is there any interest in standardizing (as a high-level API), or at least discussing, what an RDF store API would look like (similar to the Dataset)?

elf-pavlik commented 7 years ago

What, if any, is the relationship between a potential high level Store API and the High Level Dataset API?

I remember from conversation with @bergos in https://github.com/rdf-ext/discussions/issues/21

The difference between a Store and Graph is more the sync vs. async interface.

Specifically, re Store API, what about add(), merge() and graph() (which the high-level Store API had, but the low-level one does not)?

I need to look up the definitions of those methods, but it seems they can be implemented as a convenience layer on top of the low-level import and match.
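As a rough sketch of that idea, the snippet below layers graph(), merge() and add() on top of a store that only offers stream-based primitives. Everything here is an assumption made for illustration: the `TinyStore` class, the plain-object quads with string terms, the promise-returning `import()` and `deleteGraph()`, and the semantics that merge() appends to a named graph while add() replaces it. It is not the rdf-store-abstract code and not the spec's exact interface.

```javascript
const { Readable } = require('stream')

// Hypothetical in-memory store exposing only low-level, stream-based methods.
class TinyStore {
  constructor () {
    this.quads = []
  }

  // Returns a stream of matching quads; null acts as a wildcard.
  match (subject, predicate, object, graph) {
    const hits = this.quads.filter(q =>
      (subject == null || q.subject === subject) &&
      (predicate == null || q.predicate === predicate) &&
      (object == null || q.object === object) &&
      (graph == null || q.graph === graph))
    return Readable.from(hits)
  }

  // Consumes a stream of quads; resolves once all of them are stored.
  async import (stream) {
    for await (const quad of stream) this.quads.push(quad)
  }

  // Drops every quad that belongs to the given named graph.
  async deleteGraph (graph) {
    this.quads = this.quads.filter(q => q.graph !== graph)
  }
}

// graph(iri): collect all quads of one named graph into an array.
async function graph (store, iri) {
  const result = []
  for await (const quad of store.match(null, null, null, iri)) result.push(quad)
  return result
}

// merge(iri, quads): append quads to the named graph, keeping existing ones.
function merge (store, iri, quads) {
  return store.import(Readable.from(quads.map(q => ({ ...q, graph: iri }))))
}

// add(iri, quads): assumed here to replace the content of the named graph.
async function add (store, iri, quads) {
  await store.deleteGraph(iri)
  await merge(store, iri, quads)
}

// Small demonstration: two merges accumulate, a following add replaces.
async function demo () {
  const store = new TinyStore()
  const g = 'http://example.org/graph'
  await merge(store, g, [{ subject: 'ex:a', predicate: 'ex:p', object: 'ex:b' }])
  await merge(store, g, [{ subject: 'ex:a', predicate: 'ex:p', object: 'ex:c' }])
  const afterMerge = (await graph(store, g)).length
  await add(store, g, [{ subject: 'ex:x', predicate: 'ex:p', object: 'ex:y' }])
  const afterAdd = (await graph(store, g)).length
  return { afterMerge, afterAdd }
}
```

Note that add() in this sketch needs a delete primitive in addition to import and match, so the convenience layer is only as thin as the set of low-level methods allows.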

bergos commented 7 years ago

Started today to implement the new interface for rdf-store-sparql. I think it can be done like this:

rdf.dataset().import(store.match(null, null, null, iri)).then((dataset) => {
  ...
})

I will update the examples when I have finished the implementation.

RubenVerborgh commented 7 years ago

@bergos Beautiful!

bergos commented 7 years ago

You can have a look at the SPARQL Store implementation now. I still need to update the docs, but the tests already show how to distinguish between add and merge.

elf-pavlik commented 5 years ago

What, if any, is the relationship between a potential high level Store API and the High Level Dataset API?

I find the Stage 3 async iteration statement, for-await-of, very developer friendly. I understand that we may not be able to use it directly for Source.match(), since it may emit prefix, base, etc., but Store.match() could return an async iterator that one could use with a for-await-of loop.

If someone else finds such an interface attractive, I wonder where in the RDFJS specs ecosystem it could fit?

jacoscaz commented 5 years ago

I agree with @elf-pavlik, as I find async iteration quite nice from a development experience standpoint. However, I also find it deceptively tricky (i.e. it requires good knowledge of promises to deal with the effect of high numbers of iterations on memory usage), and @RubenVerborgh mentioned performance penalties when compared against approaches like asynciterator's: https://github.com/RubenVerborgh/AsyncIterator/issues/10

bergos commented 5 years ago

In general I'm not against for-await-of, but I would like to avoid changes that may require major changes in implementations. Maybe we can define it in a next version, but I'm not so happy about enforcing it now. Adding it as optional just makes the spec more complicated.

For a feature with a high chance of being implemented in different ways I would have a different opinion, but I don't fear we will have that problem with for-await-of.

Anyway, if stream in Node >=10 or the latest readable-stream is used, we get the feature out of the box: readable[Symbol.asyncIterator]. That should already be the case for all packages I moved from rdf-ext to rdfjs.
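A minimal sketch of what "out of the box" means here, assuming Node 10 or later. The object-mode stream stands in for whatever a store's match() would return; the `quadStream` and `collectObjects` helpers and the plain-object quads are placeholders invented for the example.

```javascript
const { Readable } = require('stream')

// Stand-in for the stream a store's match() would return (hypothetical data).
function quadStream () {
  const stream = new Readable({ objectMode: true, read () {} })
  stream.push({ subject: 'ex:a', predicate: 'ex:p', object: 'ex:b' })
  stream.push({ subject: 'ex:a', predicate: 'ex:p', object: 'ex:c' })
  stream.push(null) // signal the end of the stream
  return stream
}

// Readable implements Symbol.asyncIterator, so for-await-of works directly
// on the stream without any adapter.
async function collectObjects (stream) {
  const objects = []
  for await (const quad of stream) {
    objects.push(quad.object)
  }
  return objects
}
```

Calling `collectObjects(quadStream())` resolves to the object terms in the order they were pushed, so no spec change is needed for consumers who prefer this style.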

elf-pavlik commented 5 years ago

It looks like the WHATWG ReadableStream will also end up async iterable: https://github.com/whatwg/streams/issues/778#issuecomment-380845727

RubenVerborgh commented 5 years ago

I recommend against for-await-of until we have performance tests showing it to be of equal or better performance. Triples are very small data objects, and we are often dealing with millions of them. You do not want the performance penalty I've seen with async iteration so far. I'm sure it will get better, but we should only switch at that point.
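One possible shape for such a performance test is sketched below, assuming Node 10.7 or later for `process.hrtime.bigint()`. It consumes the same synthetic object-mode stream once with 'data' events and once with for-await-of and reports the elapsed time for each. The helper names and the synthetic quads are made up for the example, and a serious benchmark would need warm-up runs, repetitions, and real store implementations.

```javascript
const { Readable } = require('stream')

// Synthetic object-mode stream emitting n small quad-like objects.
function makeStream (n) {
  let i = 0
  return new Readable({
    objectMode: true,
    read () {
      this.push(i < n ? { subject: 'ex:s' + i, predicate: 'ex:p', object: 'ex:o' } : null)
      i++
    }
  })
}

// Consume the stream through classic 'data' / 'end' events.
function countWithEvents (stream) {
  return new Promise((resolve, reject) => {
    let count = 0
    stream.on('data', () => { count++ })
    stream.on('end', () => resolve(count))
    stream.on('error', reject)
  })
}

// Consume the stream through async iteration.
async function countWithForAwait (stream) {
  let count = 0
  for await (const _quad of stream) count++ // eslint-disable-line no-unused-vars
  return count
}

// Time one consumer over a fresh stream of n items.
async function timeIt (label, consume, n) {
  const start = process.hrtime.bigint()
  const count = await consume(makeStream(n))
  const ms = Number(process.hrtime.bigint() - start) / 1e6
  return { label, count, ms }
}

// Run both consumers one after the other and return their measurements.
async function compare (n) {
  return [
    await timeIt('data events', countWithEvents, n),
    await timeIt('for-await-of', countWithForAwait, n)
  ]
}
```

Running `compare(1e6).then(console.table)` prints both timings side by side; the numbers depend heavily on the Node version and the machine, so they are best compared only within a single run.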