In general, this PR makes locking around DB accesses (both read and write) less granular.
There is a tradeoff here that became more apparent the further I dug into things.
This may make functions that perform a lot of DB accesses a bit more performant. It might lessen occurrence of deadlocks.
For certain, it:
forces all other threads to block until these functions complete.
requires allocating memory for all values retrieved from get_many(), get_all().
I have commented the locking issue in the code at each point of usage.
Our previous approach processing each value at a time might be a bit slower and acquire many more locks, but it is much more concurrent and works with arbitrary number of values without exploding memory for large datasets, eg wallets with thousands of monitored utxo. For this reason, I am making this a draft PR, and not recommending it be merged until further testing/benchmarking/analysis is conducted.
"As opposed to the original, this implementation is not concurrent (yet):
Whereas the leveldb crate (wrapping C++ leveldb) does support concurrency. Also the DB::get() fn in rusty-leveldb requires (&self mut), so we cannot use an RwLock, but must instead use a Mutex. Readers block eachother as well as writers. I believe that if we were to move to the leveldb crate we could get rid of the locking around DB calls and the code in this PR would work with good concurrency. Note also that get() method takes &self. We should not need any locks around these db calls (greatly simplifying our code), but if we wanted to make our own RwLock, we could.
Also, there is a draft PR for twenty-first that provides iterators for the DB types. This would enable usage of the get many/all set-based semantics in this PR without blowing up memory for large data sets. So that is a possibility as well and would work best with the concurrent version of leveldb.
NOTE: I do not endorse using this library for any data that you care about.
one last clarification: I find this code a bit cleaner and more readable than what came before, so I hope we can merge it, or something close if/when the above issues are resolved.
feat: use get/set_many, get/set_all() for DB
Use multi-item DB operations instead of locking DB per item
Cargo.toml is modified to override twenty-first dep with
master from github.
Author: sword-smith thor@neptune.cash
Date: Thu Nov 9 21:57:36 2023 +0100
Improve lock-handling in method to prune abandoned MUTXOs
Co-authored-by: Dan <danda@neptune.cash>
In general, this PR makes locking around DB accesses (both read and write) less granular.
There is a tradeoff here that became more apparent the further I dug into things.
This may make functions that perform a lot of DB accesses a bit more performant. It might lessen occurrence of deadlocks.
For certain, it:
I have commented the locking issue in the code at each point of usage.
Our previous approach processing each value at a time might be a bit slower and acquire many more locks, but it is much more concurrent and works with arbitrary number of values without exploding memory for large datasets, eg wallets with thousands of monitored utxo. For this reason, I am making this a draft PR, and not recommending it be merged until further testing/benchmarking/analysis is conducted.
HOWEVER: the elephant in the room is the nature of the DB accesses. The
rusty-leveldb
crate we are using does not support concurrent accesses:Whereas the leveldb crate (wrapping C++ leveldb) does support concurrency. Also the DB::get() fn in
rusty-leveldb
requires (&self mut), so we cannot use anRwLock
, but must instead use aMutex
. Readers block eachother as well as writers. I believe that if we were to move to theleveldb
crate we could get rid of the locking around DB calls and the code in this PR would work with good concurrency. Note also thatget()
method takes &self. We should not need any locks around these db calls (greatly simplifying our code), but if we wanted to make our own RwLock, we could.Also, there is a draft PR for twenty-first that provides iterators for the DB types. This would enable usage of the get many/all set-based semantics in this PR without blowing up memory for large data sets. So that is a possibility as well and would work best with the concurrent version of leveldb.
Finally, bolded on the rusty-leveldb repo index page:
one last clarification: I find this code a bit cleaner and more readable than what came before, so I hope we can merge it, or something close if/when the above issues are resolved.
commit messages follow:
Author: danda dan-da@users.noreply.github.com Date: Fri Nov 17 21:01:57 2023 -0800
Author: sword-smith thor@neptune.cash Date: Thu Nov 9 21:57:36 2023 +0100