Open janbuchar opened 2 months ago
We may not need the global accessors (e.g., Dataset.open)
The open
methods are actually the only ones we might need.
We should cleanly separate filesystem-backed and memory-only storage
I am still not sure about this, we still need a way to offload things from memory, otherwise, a long-running scape would fail at some point.
We should cleanly separate filesystem-backed and memory-only storage
I am still not sure about this, we still need a way to offload things from memory, otherwise, a long-running scape would fail at some point.
As long as the memory-only thing is opt-in and the default is filesystem-backed, I'd argue we don't :shrug:
I see, so the default would be the file system one. Then I am not sure what benefits this would bring, pretty much all the problems we had were about the file system part. And that implementation would still require a lot of in-memory caching most likely, I would even say it would end up as the same as the current memory storage.
I'd personally be opposed to making fs-backed default, especially for requests (as we've seen the amount of troubles that causes, especially during concurrent accesses)... But we can definitely optimize some parts further I think
Can we support multiple non-named ("default") storages?
Should work with current memory-storage, does it not? memory-storage definitely supports it to my knowledge (.
Should work with current memory-storage, does it not? memory-storage definitely supports it to my knowledge (.s.getOrCreate() -> new UUID)
This would work out of box already if we remove the storage manager cache, its not about the storage implementations, its about the fact that you calling open()
twice resolves to the same default storage.
There are multiple considerations:
Dataset.open
)