Closed victorporof closed 4 years ago
@ncloudioj For more context, see https://docs.google.com/document/d/1aoVEcn1PR-W8qE2wOx0FSchO8qMpqaAzbt4pHktf-Wc
Also fixes https://github.com/mozilla/rkv/issues/117 and #2
@victorporof could you please add some comments to outline the major changes of this patch? The attached document has given me a great holistic rationale for this change, it will be very helpful for me to reivew the code with those inputs.
Thanks!
@ncloudioj To recap, here's how I've split up this PR. It would be excellent if we could land this today.
This commit makes it so that changeset no. 2 (a4d76eb) is readable in case of src/store/integer.rs
.
Isolates LMDB's surface area that this crate requires, splitting it into a set of interdependent traits. Structures implementing these traits are usable by RKV as a backing engine for key-value storage.
This commit also implements these traits for newtypes wrapping LMDB itself. This showcases the only changes required to user code when picking the LMDB backend, which involves the turbofish syntax (e.g Rkv::new<Lmdb>(...)
).
Changeset no. 5 (4a955a7) actually implements the safe mode backend.
The current API requires that an environment returns an owned database (a.k.a. not a reference to one) to consumers when calling open_db
or create_db
. This database type must be the same everywhere, and the instance returned must refer to the same database with every call as well, a constraint expressed through the Database
trait and associated types in the various traits defined in commit no. 2.
In case of LMDB, we do this by simply copying the structs representing LMDB databases. This is cheap in those situations because they're just wrappers around pointers to the C data structures.
In case of the safe mode implementation, we don't want to clone the entire database in this scenario. Returning an owned database without duplicating its contents is easy through smart pointers such as Arc
. However we can't copy an Arc<T>
, and since we don't want to do this unsafely, dropping the Copy
trait requirement is (probably) the way to go.
It's possible to keep the Clone
requirement instead of Copy
if we're particular about having the Database
types clone-able, however this doesn't seem necessary and I'd prefer to have consumers wrap databases in smart pointers themselves if they need this.
Same as above (commit no. 3). A smart pointer such as RwLock
cannot easily derive equality, and we need to wrap Databases in reference counted read-write locks to satisfy the existing API and avoid unnecessarily duplicating entire database when calling open_db
or create_db
, as well as in other cases. We could manually check equality with built-in methods like Arc::ptr_eq
, but things get complicated for Arc<RwLock<T>>
, and in practice this feature doesn't seem necessary.
Implements the traits describing a backing store (defined in commit no. 2). We don't aim for feature completeness here, just enough so that various RKV tests still pass (to verify that the implementation actually behaves like LMDB in the most common cases). CRLite support dictates how relaxed we can be about performance.
These commits add more tests.
Addressed comments.
Landed in feature branch safe-mode
: https://github.com/mozilla/rkv/tree/safe-mode
Closing in favor of https://github.com/mozilla/rkv/pull/159
WIP for allowing Rkv to use multiple backing stores. "Safe mode" backing store implementation should now be a breeze.