Closed roblatham00 closed 6 years ago
Hi @roblatham00, please try the btree engine in place of kvtree2, and I bet this will work.
- kvtree2 engine doesn't implement Each (as you've seen), but supports arbitrarily large values
- btree engine implements Each, but only supports small values (<500 bytes) by default; however, you can recompile with a larger limit
- kvtree3 engine that supports Each and large values, and this will take over as our new default engine
Please post back if you need any more help getting this running, happy to help!
Thanks, RobD
Switched to btree but now I get a std::bad_alloc exception from make_persistent_atomic when storing the 64th word. I am only storing an empty string along with the keys so I don't think I'm hitting any 500 byte limit.
Or did you mean btree only supports 500 bytes worth of anything?
Hmm, please make sure you delete your persistent pool that's already around. I suspect this is due to opening the kvtree2 binary format using the btree engine, which is not something we actively prevent at this time.
With btree the values are limited to 500 bytes each, so you should be fine there.
File is definitely zapped. I am faking this with a plain old file on a file system, so perhaps that is complicating things.
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
zsh: abort (core dumped) ./writer kv.db words
Oh duh, it's the pool filling up. Increase PMEMOBJ_MIN_POOL to a larger value. :smile:
#define PMEMOBJ_MIN_POOL ((size_t)(1024 * 1024 * 8)) /* 8 MiB */
Just so I understand clearly: each key/value pair in pmemkv's btree engine consumes 128 KiB?
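For reference, the 128 KiB figure follows from dividing the 8 MiB pool by 64 keys. The sketch below just redoes that arithmetic at compile time; `kMinPoolBytes` and `bytes_per_key` are illustrative names, not part of pmemkv or PMDK, and the result is an upper bound because it ignores whatever the pool spends on its own metadata.

```cpp
#include <cstddef>

// Pool size taken from the #define quoted above: 8 MiB.
constexpr std::size_t kMinPoolBytes = 1024 * 1024 * 8;

// Average pool bytes per key, assuming the pool is full after `keys` inserts.
// Hypothetical helper for illustration only.
constexpr std::size_t bytes_per_key(std::size_t pool_bytes, std::size_t keys) {
    return pool_bytes / keys;
}

// The allocation failed around the 64th word, so divide by 64:
// 8 MiB / 64 = 131072 bytes = 128 KiB.
static_assert(bytes_per_key(kMinPoolBytes, 64) == 128 * 1024,
              "8 MiB spread over 64 keys is 128 KiB per key");
```

Strictly, a failure on the 64th insert means only 63 entries fit, so the true average is slightly higher than 128 KiB.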
I doubled the pool size and put values in a loop until breaking out once std::bad_alloc is caught. Reader reads back 100 keys now.
thanks for the help.
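The "fill until std::bad_alloc" loop described above can be sketched roughly as follows. This is only an illustration of the pattern: `FakeStore` is a made-up stand-in with a fixed number of slots, not the pmemkv API, and the pool and entry sizes are arbitrary assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <vector>

// Hypothetical fixed-capacity store standing in for a real engine.
// It throws std::bad_alloc once every slot is used, like a full pool.
class FakeStore {
public:
    FakeStore(std::size_t pool_bytes, std::size_t entry_bytes) {
        assert(entry_bytes > 0);  // guard the division below
        capacity_ = pool_bytes / entry_bytes;
    }

    // The value is ignored here; only the slot count matters for the sketch.
    void put(const std::string& key, const std::string& /*value*/) {
        if (keys_.size() >= capacity_) throw std::bad_alloc();
        keys_.push_back(key);
    }

private:
    std::size_t capacity_ = 0;
    std::vector<std::string> keys_;
};

// Stores words with empty values until the store reports it is full,
// then returns how many words were actually stored.
std::size_t fill_until_full(FakeStore& store,
                            const std::vector<std::string>& words) {
    std::size_t stored = 0;
    try {
        for (const auto& word : words) {
            store.put(word, "");
            ++stored;
        }
    } catch (const std::bad_alloc&) {
        // Pool exhausted: stop inserting and report what fit.
    }
    return stored;
}
```

Calling `fill_until_full` on a store sized for four entries with six words returns 4, which is the same kind of count the writer would use to know how many keys to expect back.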
I don't think btree will be very space-efficient for this case: it's not variable-length, so it's always going to write out a full entry even for an empty value. That said, I agree with you that space usage looks rather high here for the number of keys. I'm curious to dig into this a bit more and understand this better.
Anyway, great to hear that things are working, and thanks for trying out the API!
Oh, I also wanted to mention -- using plain old files is totally ok for prototyping, but a few tips if you're going to do any benchmarking with larger pools:
- PMEM_IS_PMEM_FORCE=1 in your environment so PMDK treats your file like persistent memory
- PMEM_NO_FLUSH=1 to speed up writes by skipping all the flushes that are normally done for strict consistency
Thanks!
Hey @roblatham00, closing out this issue with a few follow-up notes:
- kvtree3 engine supports the Each operation, so you don't have to use the more experimental btree engine for this kind of test.
- btree engine: the core dump that you saw during this testing when the engine ran out of space (#126)
- btree engine is pretty bad in its current form, but even the kvtree engine is showing low storage efficiency for very small values as you were testing in this case. Storage efficiency also drops as the size of the pool is reduced. It's fair to admit that much of our testing centers on larger pools (>1GB) and larger values (800 bytes is typical for our benchmarks). I'm curious if we can add an optimization to improve storage efficiency for small values. (#127)
Wanted to get familiar with the pmemkv API so I wrote a writer and reader pair. The writer reads words out of a dictionary (like /usr/share/dict/words) and stores them in pmemkv. Then the reader tries to read them back. Except my reader isn't reporting anything:
Writer:
Reader: