dgraph-io / dgraph

The high-performance database for modern applications
https://dgraph.io

support SPARQL #4487

Open wangdsh opened 4 years ago

wangdsh commented 4 years ago

I think SPARQL query support is important. The reasons are given here:
https://github.com/dgraph-io/dgraph/issues/1#issuecomment-266360074
I hope Dgraph will support SPARQL queries in 2020.

suesunss commented 4 years ago

@marvin-hansen @wangdsh

Agree. Cypher mainly benefits the application layer and is easy to use, while Gremlin has a richer community that no other graph DB can compete with, as far as I am concerned. Gremlin also has richer expressivity for performing arbitrarily complex graph traversals, as you mentioned in a previous post. The problem with Gremlin is that, as queries get more complex, it becomes more difficult to perform ad-hoc/automatic optimizations, which I think are essential for any queryable database. Optimizations are even more difficult for users without a solid background in databases and graph traversal virtual machines.

But the real origin is SPARQL, which has a clean, simple syntax and yet powerful expressivity. I really hope to see SPARQL support; it would benefit both the graph DB and Semantic Web communities, and would probably have a deeper influence on future web technology.

MichelDiz commented 4 years ago

The biggest challenge with SPARQL is that the language was made for triple store databases that follow W3C standards. Dgraph isn't an RDF triple store, despite using RDF (in its simplest, rawest format).

IMHO, it's kind of chaotic having to maintain support for different languages with different standards and different requirements/needs.

Imagine the chaos of maintaining GraphQL, GraphQL+-, Gremlin, Cypher and SPARQL. For each new Dgraph feature, keeping them all in sync would be herculean work. Even if we supported just one more, we would have to abandon GraphQL+- in favor of the new language (and redesign Dgraph), and that would be a language we don't control: if we added features to Dgraph, we would have to ask the language maintainers to add them to the language spec.

I believe it would be easier for you to select the features you like most in SPARQL (or any other) and ask for support in Dgraph than to add another language. Or even suggest changes in the GraphQL+- syntax.

That's my two cents as a user

BTW, my opinion doesn't reflect what Dgraph in general thinks.

Extra example of the difference between Dgraph and SPARQL

SPARQL uses "PREFIX", which is linked to the identifier stored in the RDF store format. In Dgraph, this identifier is converted to a UID. So, to make this work, it is necessary to sanitize the dataset, which makes it incompatible with any other RDF store (meaning that when you export the RDF you will need to revert the sanitization and rebuild the identifiers yourself).

Also, the sanitization would need a "hacky" way to make the "PREFIX" keyword work, and one approach would have to be defined, e.g. using Dgraph's type system or edges to represent the PREFIX inside Dgraph.
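To make the PREFIX point concrete, here is a minimal sketch in Python. It is not Dgraph code: the prefix table, the `expand` helper and the `UidMap` class are all hypothetical names made up for illustration. It only shows the general idea that a prefixed name expands to a full identifier, that the identifier is swapped for a numeric UID, and that a reverse map has to be kept around if the identifiers are to be rebuilt on export.

```python
from itertools import count

# Hypothetical prefix table, as a SPARQL "PREFIX" declaration would define it.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "ex": "http://example.org/",
}


def expand(name, prefixes=PREFIXES):
    """Expand a prefixed name such as 'ex:alice' into a full identifier.

    Names whose prefix is not in the table are returned unchanged.
    """
    prefix, sep, local = name.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return name


class UidMap:
    """Toy identifier <-> UID mapping (an assumption, not Dgraph's internals)."""

    def __init__(self):
        self._next = count(1)
        self.iri_to_uid = {}
        self.uid_to_iri = {}

    def uid_for(self, iri):
        # Assign a fresh UID the first time an identifier is seen.
        if iri not in self.iri_to_uid:
            uid = next(self._next)
            self.iri_to_uid[iri] = uid
            self.uid_to_iri[uid] = iri
        return self.iri_to_uid[iri]
```

Without something like `uid_to_iri` stored alongside the data (for example in an edge, as suggested above), an export could only emit UIDs and the original identifiers would be lost.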

MichelDiz commented 4 years ago
  1. It is basically posting lists stored as key-value pairs in BadgerDB. You can read more about it in the newly released paper: https://github.com/dgraph-io/dgraph/blob/master/paper/dgraph.pdf

An RDF triple is basically a "KV" with an identifier.

Dgraph is a triple system, but not exactly a triple store per se.

  2. Yes.

  3. I am not sure, because I was not present when Dgraph started. But I have a slight idea of why.

Basically, Dgraph was "mirroring" itself on GraphQL, and GraphQL had no specific mutation patterns other than JSON objects inside a mutation block. Perhaps the engineers who started the project with Manish had some familiarity with RDF (they certainly took Semantic Web classes at university), so RDF seemed an obvious choice at that moment. But not the whole package.

I have this slight idea after reading old commits. But I can ask Manish about it.

Anyway, GraphQL does not use Semantic Web concepts, and neither do we. There was also no demand for this feature, so the DB matured without them. Besides, Dgraph is a DB aimed at common web services (like NoSQL databases are) and not at ontologies or similar. You can still do it, as any graph DB is customizable, but it would not follow any specific standard, and you would have to "fit" it into GraphQL+-.
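Going back to the first answer above, the "posting lists in a key-value store" idea can be sketched roughly as follows. This is a toy in-memory model, not Dgraph's or BadgerDB's actual API: the `PostingStore` class and its method names are hypothetical, and the choice of `(predicate, subject UID)` as the key is a simplifying assumption.

```python
from collections import defaultdict


class PostingStore:
    """Toy stand-in for a key-value store that holds posting lists."""

    def __init__(self):
        # Key: (predicate, subject UID). Value: list of objects (the "postings").
        self._kv = defaultdict(list)

    def add_triple(self, subject_uid, predicate, obj):
        """Record one <subject> <predicate> <object> triple."""
        postings = self._kv[(predicate, subject_uid)]
        if obj not in postings:
            postings.append(obj)

    def objects(self, subject_uid, predicate):
        """Return a copy of the posting list for a subject and predicate."""
        return list(self._kv.get((predicate, subject_uid), []))
```

In this model every triple lands in exactly one key-value entry, which is the sense in which a triple behaves like a "KV" with an identifier, without the store being a W3C-style triple store.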

I understand that, lacking the foundation of a triple or quad store, there is actually very little that can be done to fully support RDF & SPARQL.

We can try to support JSON-LD, a format that several RDF DBs can export their data to. That's why I have opened some issues on this topic: https://github.com/dgraph-io/dgraph/issues/4897

And also https://github.com/dgraph-io/dgraph/issues/4898 https://github.com/dgraph-io/dgraph/issues/4915

All of these are small steps toward letting users easily import data coming from RDF triple stores (I am studying the problems related to this). We could import JSON-LD and export a JSON file 99.9% similar to JSON-LD, which is compatible with several tools out there.
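As a rough illustration of what such an import step might involve, here is a minimal Python sketch that flattens one simple JSON-LD node into a plain JSON object. Everything in it is an assumption for illustration only: the function name, the `xid` key used to keep the original identifier, the mapping of `@type` to `dgraph.type`, and the choice to expand keys through the `@context`. It handles only flat nodes with string-valued context entries, with no nesting, `@graph` or lists.

```python
def jsonld_node_to_plain(node, context):
    """Flatten one simple JSON-LD node into a plain JSON object (sketch only)."""
    out = {}
    for key, value in node.items():
        if key == "@context":
            # The context is applied to the keys below, not copied over.
            continue
        if key == "@id":
            # Keep the original identifier so it can be restored on export.
            out["xid"] = value
        elif key == "@type":
            out["dgraph.type"] = value
        else:
            term = context.get(key)
            # Only plain string mappings are expanded; anything else is kept as-is.
            name = term if isinstance(term, str) else key
            out[name] = value
    return out
```

The reverse direction (export) would need the same context to compact the keys again, which is why keeping the identifier and the context around matters.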

What's the long-term vision of Dgraph?

We are discussing it here: https://discuss.dgraph.io/t/dgraphs-new-versioning-scheme/6106/4

Do you mean SPARQL? I'm not sure. We have to finish the GraphQL spec support first. There are a lot of things to be done before starting a new adventure.

MichelDiz commented 4 years ago

We can still have full support for JSON-LD, because users can bring their data in and, if they wish, "sanitize" it using a Bulk Upsert mutation. There is nothing hard in JSON-LD that we can't deal with.

WolfgangFahl commented 4 years ago

This issue is not closed - it's just moved - what a pity. I'd love to join the discussion but https://discuss.dgraph.io/t/hope-to-support-sparql-query-in-2020/8809 is just not up to the task - the Discuss UI is IMHO horrible! Please get back to GitHub.

WolfgangFahl commented 2 years ago

@MichelDiz - great that this is reopened. I might actually try dgraph again now.

mediaprophet commented 1 year ago

Any update on SPARQL support?

MichelDiz commented 1 year ago

Nope, this might take much longer. Supporting GraphQL took a year and a half of intensive work with a dedicated team. It would be nice to have support from the community. Perhaps researching the best approach would help advance the planning. But so far there has been zero work or research on SPARQL.

github-actions[bot] commented 3 months ago

This issue has been stale for 60 days and will be closed automatically in 7 days. Comment to keep it open.

WolfgangFahl commented 2 months ago

Why not go the snapquery way and have an independent named parameterized query layer? I assume this would be a new issue, but it would fix the problem. See the snapquery demo at https://snapquery.bitplan.com/. A scientific paper on the topic is currently under review.