marklogic / marklogic-jena

Adapter for using MarkLogic with the Jena RDF Framework

Make getGraph a view of the database, not a copy of part of it. #29

Closed. afs closed this issue 7 years ago

afs commented 8 years ago

At the moment, MarkLogicDatasetGraph.getGraph returns a copy of the graph in the database.

This means that:

  1. Changes do not propagate back to the database
  2. The size of graph that can be worked with is limited by RAM

The alternative is for the returned graph to be a view: add, delete and find become mapping operations, with find(s,p,o) mapped to find(graphName, s,p,o), etc.
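The mapping described above can be sketched roughly as follows. This is a minimal, library-free illustration and not the Jena or MarkLogic API: `QuadStore` and `NamedGraphView` are hypothetical names, quads are modelled as plain lists of strings, and `null` is assumed to act as a wildcard in `find`.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the quad store behind a dataset graph.
class QuadStore {
    // Each quad is stored as [graph, subject, predicate, object].
    private final Set<List<String>> quads = new LinkedHashSet<>();

    void add(String g, String s, String p, String o) {
        quads.add(List.of(g, s, p, o));
    }

    void delete(String g, String s, String p, String o) {
        quads.remove(List.of(g, s, p, o));
    }

    // A null argument matches any value in that position.
    List<List<String>> find(String g, String s, String p, String o) {
        List<List<String>> result = new ArrayList<>();
        for (List<String> q : quads) {
            if (matches(g, q.get(0)) && matches(s, q.get(1))
                    && matches(p, q.get(2)) && matches(o, q.get(3))) {
                result.add(q);
            }
        }
        return result;
    }

    private static boolean matches(String pattern, String value) {
        return pattern == null || pattern.equals(value);
    }
}

// Hypothetical view of one named graph: it holds no triples of its own,
// so every operation is forwarded to the store with the graph name added.
class NamedGraphView {
    private final QuadStore store;
    private final String graphName;

    NamedGraphView(QuadStore store, String graphName) {
        this.store = store;
        this.graphName = graphName;
    }

    void add(String s, String p, String o) {
        store.add(graphName, s, p, o);
    }

    void delete(String s, String p, String o) {
        store.delete(graphName, s, p, o);
    }

    // find(s,p,o) is mapped to find(graphName, s,p,o); the graph slot is
    // dropped from each result so callers see [subject, predicate, object].
    List<List<String>> find(String s, String p, String o) {
        List<List<String>> triples = new ArrayList<>();
        for (List<String> q : store.find(graphName, s, p, o)) {
            triples.add(q.subList(1, 4));
        }
        return triples;
    }
}
```

Because the view keeps no state of its own, a change made through it is visible in the store immediately, and the graph's size is bounded by the store rather than by local memory.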

grechaw commented 8 years ago

You've observed something I definitely missed about deeper integration with Jena. I have to consider it an enhancement request, and it's obviously not trivial to do, but I will advocate for it.

grechaw commented 8 years ago

Assigning to @sbuxton so that we'll have a conversation about scope and timing.

grechaw commented 8 years ago

Setting to milestone 1.0.2 to try to scope a release

afs commented 8 years ago

Jena has the class GraphView to provide this. It is used by a number of DatasetGraph implementations.

grechaw commented 8 years ago

Oh, thank you for that pointer @afs, I thought that was an interface I had to implement.

grechaw commented 8 years ago

I was able to get most of the way to fixing this. The lingering issue is that I never implemented a reasonable delete cache, because before this implementation add/delete were not tied to server-side state. Having a delete cache will align the behavior of the Sesame and Jena libraries.

afs commented 8 years ago

In certain applications, we have found that buffering inserts and deletes is necessary. Both need to be buffered so that a Graph.find(?,?,?) can check the locally buffered changes and generate the right answers.
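One way to picture the buffering described above is the sketch below. It is an assumption-laden illustration, not the marklogic-jena implementation: `BufferedGraphSketch` is a hypothetical name, the `base` set stands in for server-side state, triples are plain lists of strings, and `null` is assumed to be a wildcard in `find`.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical graph that buffers adds and deletes before writing them back.
class BufferedGraphSketch {
    private final Set<List<String>> base; // stands in for server-side state
    private final Set<List<String>> pendingAdds = new LinkedHashSet<>();
    private final Set<List<String>> pendingDeletes = new LinkedHashSet<>();

    // The base set must be mutable, because flush() writes into it.
    BufferedGraphSketch(Set<List<String>> base) {
        this.base = base;
    }

    void add(String s, String p, String o) {
        List<String> t = List.of(s, p, o);
        pendingDeletes.remove(t); // a later add cancels a buffered delete
        pendingAdds.add(t);
    }

    void delete(String s, String p, String o) {
        List<String> t = List.of(s, p, o);
        pendingAdds.remove(t); // a later delete cancels a buffered add
        pendingDeletes.add(t);
    }

    // Answers reflect the base state with both buffers applied on top.
    List<List<String>> find(String s, String p, String o) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> t : base) {
            if (!pendingDeletes.contains(t) && matches(t, s, p, o)) {
                out.add(t);
            }
        }
        for (List<String> t : pendingAdds) {
            // Skip triples already reported from the base state.
            if (!base.contains(t) && matches(t, s, p, o)) {
                out.add(t);
            }
        }
        return out;
    }

    // Apply the buffered changes to the base state and clear the buffers.
    void flush() {
        base.removeAll(pendingDeletes);
        base.addAll(pendingAdds);
        pendingDeletes.clear();
        pendingAdds.clear();
    }

    private static boolean matches(List<String> t, String s, String p, String o) {
        return (s == null || s.equals(t.get(0)))
                && (p == null || p.equals(t.get(1)))
                && (o == null || o.equals(t.get(2)));
    }
}
```

If only inserts were buffered, `find` would still report a triple that has a pending delete; if only deletes were buffered, it would miss a triple that has a pending add. Consulting both buffers keeps the answers consistent before and after `flush()`.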

grechaw commented 8 years ago

That's enough evidence for me.