fluree / db

Fluree database library
https://fluree.github.io/db/

Support multiple graphs #460

Open zonotope opened 1 year ago

zonotope commented 1 year ago

RDF Datasets can consist of multiple graphs: exactly one default graph, and 0 or more named graphs. Fluree currently only supports the default graph. We should add support for named graphs in both queries and transactions.

See also:

dpetran commented 1 year ago

Need to decide what the syntax is for

bplatz commented 1 year ago

Named graphs in JSON-LD: https://www.w3.org/TR/json-ld11/#named-graphs

bplatz commented 1 year ago

The more I think about this, the more I think we shouldn't allow multiple named graphs inside a single ledger.

I do think that we could allow you to specify multiple graphs in a single JSON-LD transaction, but we would treat this as a single atomic transaction across multiple ledgers (which in theory could be distributed). To do this, we'd have to introduce a wait/lock at the ledger level and a timeout that would fail the entire transaction unless all specified ledgers successfully updated within a specified time.
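
A minimal, single-process sketch of the lock-plus-timeout idea follows. It is only an illustration under strong assumptions: `Ledger` and `transact_all` are hypothetical names, the ledgers live in one process and use `threading.Lock`, and applying an update cannot fail once all locks are held. A distributed version would need a real commit protocol.

```python
import threading
import time


class Ledger:
    """Hypothetical in-memory ledger guarded by a ledger-level lock."""

    def __init__(self, name: str):
        self.name = name
        self.lock = threading.Lock()
        self.data: list = []


def transact_all(updates: list, timeout_s: float) -> bool:
    """Apply every (ledger, items) update, or none if locks can't be taken in time."""
    deadline = time.monotonic() + timeout_s
    acquired = []
    try:
        # Take locks in a fixed order (by ledger name) to avoid deadlock
        # between two concurrent multi-ledger transactions.
        for ledger, _ in sorted(updates, key=lambda pair: pair[0].name):
            remaining = deadline - time.monotonic()
            if remaining <= 0 or not ledger.lock.acquire(timeout=remaining):
                return False  # timed out: nothing has been written yet
            acquired.append(ledger)
        # All locks are held, so the writes happen together.
        for ledger, items in updates:
            ledger.data.extend(items)
        return True
    finally:
        for ledger in acquired:
            ledger.lock.release()
```

Because no write happens until every lock is held, a timeout on any one ledger leaves all of them unchanged.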

As for querying across multiple graphs, I just created ticket #526 to specify how to do this. In this case FROM can contain multiple ledgers (graphs) that all act as the "default" graph for the query. This is per the SPARQL spec. FROM NAMED allows you to narrow parts of the query to a specified graph, excluding anything in the default graph.
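
To make the FROM / FROM NAMED distinction concrete, here is a small sketch that treats each ledger as a set of triples. The function names and the `ledgers` mapping are hypothetical; this is not the syntax proposed in #526, only the dataset semantics it refers to.

```python
def build_query_dataset(ledgers: dict, from_: list, from_named: list):
    """Build the query dataset: FROM graphs are merged into the default graph,
    FROM NAMED graphs stay separate and are addressed by name."""
    default = set()
    for name in from_:
        default |= ledgers[name]
    named = {name: ledgers[name] for name in from_named}
    return default, named


def match(graph: set, pattern: tuple) -> list:
    """Return triples matching an (s, p, o) pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if all(want is None or want == got for want, got in zip(pattern, t))
    )


if __name__ == "__main__":
    ledgers = {
        "people": {("ex:alice", "ex:knows", "ex:bob")},
        "orgs": {("ex:acme", "ex:employs", "ex:alice")},
        "audit": {("ex:alice", "ex:loggedIn", "2024-01-01")},
    }
    default, named = build_query_dataset(ledgers, ["people", "orgs"], ["audit"])
    print(match(default, ("ex:alice", None, None)))         # default graph only
    print(match(named["audit"], ("ex:alice", None, None)))  # the named graph only
```

A pattern evaluated against `named["audit"]` never sees triples from `people` or `orgs`, which mirrors the "excluding anything in the default graph" behavior.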

Of course, querying across multiple graph indexes will be slower than querying one big index. I think we could offer an optimization where we physicalize multiple graphs into a single index set if those graphs will frequently be queried as one and performance is critical. This could be a user-specified optimization, obviously impacting storage requirements and increasing the compute required for indexing.
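
A toy sketch of that trade-off, assuming each graph index is simply a sorted list of (s, p, o) tuples (Fluree's real index structures are not modeled here, and `materialize` and `subject_scan` are hypothetical names): merging once costs extra storage and indexing work, but later lookups hit a single index.

```python
import heapq
from bisect import bisect_left


def materialize(indexes: list) -> list:
    """Merge several sorted triple indexes into one sorted, de-duplicated index."""
    merged = []
    for triple in heapq.merge(*indexes):
        if not merged or merged[-1] != triple:
            merged.append(triple)
    return merged


def subject_scan(index: list, subject: str) -> list:
    """Return all triples for a subject from a sorted index via binary search."""
    start = bisect_left(index, (subject,))
    out = []
    for triple in index[start:]:
        if triple[0] != subject:
            break
        out.append(triple)
    return out
```

Without the merged index, the same lookup would have to run `subject_scan` against every graph's index and combine the results on each query.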