Open kenwenzel opened 2 years ago
I thought about this a bit:
OK, here is a concrete plan:
All in all this is a breaking change to the storage formats of value store and triple store.
Hi, 2 questions:
is this work slated for 5.0?
when is 5.0 targeted for release? @hmottestad
This isn't planned for 5.0 as far as I know. 5.0 is somewhat delayed. It's taken much longer to iron out bugs and compatibility issues than I had expected. There are still one or more things I need to look into before I can publish the last milestone build.
Understood. Do we have rough timelines for the 5.0 release? Q3/Q4?
Not going to make any promises.
RDF-star support requires a rework of the ID encoding in the value store which would be a breaking change. When starting this I would try to create a future-proof extendable ID-scheme.
@kenwenzel can you share more info on your design to 1) get lmdb out of experimental and 2) add rdfstar? For (2), perhaps (1) work can position rdfstar as an additive later w/o breaking change.
We were going down the track of rocksdb but are looking at lmdb bc you've already integrated it with rdf4j so perhaps we can assist with it getting to prod.
The other thought is perhaps getting it to prod in 4x with uncertainty of 5x release even if not backward compat given it's still in experimental currently? What are your thoughts around that? Tx
@nguyenm100 Feature-wise the store is on par with NativeStore and additionally supports deletion of values. It would help if you could test it in a setting that is comparable to your production environment. One critical feature that would simplify future extensions is a better ID scheme. I've also thought about inlining values like Jena TDB2 does: https://github.com/eclipse-rdf4j/rdf4j/issues/4774
We could adopt a scheme that is comparable to Jena's. An important difference is that we use varints to encode the IDs and therefore we need to modify the scheme in a way that it always leads to small integer values. (flags and types need to be added in the lower bits, not in the higher ones)
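To illustrate the point about lower bits, here is a minimal sketch and not the store's actual implementation. It assumes a hypothetical 2-bit type tag and uses plain LEB128 as a stand-in varint format (the varint encoding used by the LMDB store may differ). The class name, constants, and helper methods are all made up for this example. It compares the encoded size of an ID when the tag sits in the low bits versus the high bits.

```java
import java.io.ByteArrayOutputStream;

public class TaggedVarintIds {
    // Hypothetical type tags; a real scheme may need more bits or other kinds.
    static final int TYPE_BITS = 2;
    static final long TYPE_MASK = (1L << TYPE_BITS) - 1;
    static final int TYPE_IRI = 0, TYPE_BNODE = 1, TYPE_LITERAL = 2, TYPE_TRIPLE = 3;

    // Tag in the LOW bits: the ID stays numerically small for small sequence numbers.
    static long tagLow(long seq, int type) {
        return (seq << TYPE_BITS) | type;
    }

    // Tag in the HIGH bits (bits 60-61 here): the ID becomes a huge number.
    static long tagHigh(long seq, int type) {
        return ((long) type << 60) | seq;
    }

    static int typeOf(long lowTaggedId) {
        return (int) (lowTaggedId & TYPE_MASK);
    }

    static long seqOf(long lowTaggedId) {
        return lowTaggedId >>> TYPE_BITS;
    }

    // Unsigned LEB128: 7 payload bits per byte, high bit marks continuation.
    // Assumption: used only as a stand-in for the store's own varint format.
    static byte[] encodeVarint(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    static long decodeVarint(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        long low = tagLow(5, TYPE_TRIPLE);
        long high = tagHigh(5, TYPE_TRIPLE);
        System.out.println("low-bit tag:  " + encodeVarint(low).length + " byte(s)");
        System.out.println("high-bit tag: " + encodeVarint(high).length + " byte(s)");
    }
}
```

With these assumptions, sequence number 5 tagged as a triple encodes to 1 byte when the tag is in the low bits and to 9 bytes when it is in the high bits, which is why a Jena-style layout would need to be adapted before it is combined with varint IDs.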
@kenwenzel Hey Ken, we will definitely run lmdb through its paces over the next quarter or so. Wanted to revisit the idea with you of taking LMDB out of experimental status in 4.x as opposed to 5.x, given that there doesn't seem to be a definitive timeframe on 5.x atm. Are you open to that?
Hi @nguyenm100 ,
my opinion is that we can take LMDB out of experimental status once at least the following issues are fixed:
The first one is a breaking change to the data format, and therefore I'm not sure if it could be backported to 4.x.x. The last one in particular will need some careful investigation, as you won't want your production system to fail if a query gets cancelled due to a time limit.
Is it possible for you to start with the NativeStore and then switch to the LmdbStore at some later point in time? If not then what is your motivation for using the LmdbStore?
Hey @kenwenzel, we're looking at LmdbStore for its speed and large-dataset support. Per https://rdf4j.org/javadoc/3.4.3/org/eclipse/rdf4j/sail/nativerdf/NativeStore.html, NativeStore only supports up to 100m triples.
Agree #4950 would be a backward breaking change, but my thought was that lmdb is still in experimental and not yet released so backward compat needn't be guaranteed. I make this judgement based on the fact that 5.x doesn't have a concrete release date atm. Also, moving to 5.x will introduce a lot of risk outside of just lmdb.
Problem description
The LMDB store does not yet support values of type org.eclipse.rdf4j.model.Triple. A simple solution could be to handle those triples like other RDF values and store them within the value store.
Preferred solution
No response
Are you interested in contributing a solution yourself?
Perhaps?
Alternatives you've considered
No response
Anything else?
No response