Open JervenBolleman opened 2 years ago
While actually trying to implement this, I realized it is not sufficient. Collections that are backed by disk, e.g. MapDB ones, need to release their resources as soon as possible. This means that these collections need to be created and maintained in a context.
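A minimal sketch of what "created and maintained in a context" could look like, assuming the context is simply an `AutoCloseable` factory that tracks every collection it hands out. The names `CollectionFactory`, `TrackingCollectionFactory` and `DistinctCounter` are hypothetical and only illustrate the idea; they are not taken from the RDF4J code base, and a disk-backed variant would delete its files in `close()` instead of clearing in-memory sets.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical interface: a factory that owns every collection it hands out.
interface CollectionFactory extends AutoCloseable {
    <T> Set<T> createSet();

    // Narrowed so that callers need not handle a checked exception.
    @Override
    void close();
}

// In-memory stand-in; a MapDB-style implementation would release disk resources in close().
final class TrackingCollectionFactory implements CollectionFactory {
    private final List<Set<?>> handedOut = new ArrayList<>();
    private boolean closed = false;

    @Override
    public <T> Set<T> createSet() {
        if (closed) {
            throw new IllegalStateException("factory is closed");
        }
        Set<T> set = new HashSet<>();
        handedOut.add(set);
        return set;
    }

    @Override
    public void close() {
        // Release everything that was created through this factory.
        for (Set<?> set : handedOut) {
            set.clear();
        }
        handedOut.clear();
        closed = true;
    }

    boolean isClosed() {
        return closed;
    }
}

// Usage: the try-with-resources block is the "context" that bounds the collection's lifetime.
final class DistinctCounter {
    static int countDistinct(List<String> values) {
        try (CollectionFactory factory = new TrackingCollectionFactory()) {
            Set<String> seen = factory.createSet();
            seen.addAll(values);
            return seen.size();
        }
    }
}
```

The point of the design is that the caller never releases individual collections; closing the factory releases all of them at once, as early as the surrounding block ends.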
Pushed a nice idea regarding the slow group by for the LMDB store. Quickly pushed to GitHub before my laptop battery dies.
Problem description
As discussed in #3797, we often need to materialize values to be able to store them in a list. However, we can often do even better if we are able to optimize the collections knowing how the values are implemented. For example, we often use a primitive long to identify a value in a store. This means that for a value set we could store these as the primitive value and regenerate them as a Value on demand. This would both avoid materializations and improve memory density.
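The idea of storing only the primitive id and regenerating the value on demand can be sketched as follows. This is an illustration under stated assumptions, not the store's actual code: `StoredValue` is a hypothetical stand-in for a store-backed value (it is not the RDF4J `Value` interface), `LongBackedValueSet` is a made-up name, and the linear scan in `contains` is only there to keep the sketch short; a real implementation would presumably use a primitive hash set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.LongFunction;

// Hypothetical stand-in for a value that the store identifies by a primitive long.
final class StoredValue {
    final long id;
    final String label;

    StoredValue(long id, String label) {
        this.id = id;
        this.label = label;
    }
}

// Keeps only the ids in a long[]; values are rebuilt through the resolver when asked for.
final class LongBackedValueSet {
    private long[] ids = new long[16];
    private int size = 0;
    private final LongFunction<StoredValue> resolver;

    LongBackedValueSet(LongFunction<StoredValue> resolver) {
        this.resolver = resolver;
    }

    // Returns true if the value's id was not yet present.
    boolean add(StoredValue value) {
        if (contains(value.id)) {
            return false;
        }
        if (size == ids.length) {
            ids = Arrays.copyOf(ids, size * 2);
        }
        ids[size++] = value.id;
        return true;
    }

    // Linear scan for brevity only.
    boolean contains(long id) {
        for (int i = 0; i < size; i++) {
            if (ids[i] == id) {
                return true;
            }
        }
        return false;
    }

    int size() {
        return size;
    }

    // Regenerates the values on demand, in insertion order.
    List<StoredValue> materialize() {
        List<StoredValue> out = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            out.add(resolver.apply(ids[i]));
        }
        return out;
    }
}
```

Membership checks and deduplication touch only the `long[]`, so no value has to be materialized until a caller actually asks for it.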
This would also allow an improved serialization setup. We almost always fall back to java serializing the string representation of IRIs etc.
We can use the current java serialization one
With a corresponding `ByteArrayToValue` interface; but also a pair like
This should make the sort and group by code etc. faster.
Preferred solution
Have a `getCollectionFactory()` method with a default implementation on the sail that can provide such a collection on demand.
Are you interested in contributing a solution yourself?
Yes
Alternatives you've considered
Improving the hash codes, which is still worth doing but is a different problem.
Anything else?
No response