Closed salortiz closed 3 years ago
You can't really use different shared structures settings for writing and for reading with msgpackr. However, you can make your migration script work by assigning a separate decoder that uses the previous setting, i.e. without shared structures:
conf.sharedStructuresKey = Buffer.from('structs'); // Add structs to conf
let store = open('mydata', conf)
store.decoder = new Decoder() // import Decoder from msgpackr package and use it without shared structures
let iter = store.getRange({
... do migration
store.decoder = store.encoder // if you want to restore the decoder that uses that shared structure after migration
@kriszyp
I don't need to have different shared structures (SS) for reading and writing.
During the migration and after it I will use one and only one.
The records created originally (without SS) were written as simple entries (without the SS-related extensions), so I expected that, during the migration, when the iterator reads (decodes) them, the machinery added by sharedStructuresKey would not be used, but would be created and used at the first (and subsequent) store.put calls (encode), overwriting the original value with the new encoding (i.e. with msgpack's SS extension type).
My tests show that the store.decoder created with SS active can read the original values without problems, and all writes will use the same store.encoder, so I don't understand where the problem is.
I have already worked around my problem by reading all data into memory (and removing it from the store) in the iterator loop, and then writing everything in one go in a separate loop. I still want to report the issue because it violated my expectations and no errors were reported during the migration; the error appears later (in a separate run) using the same sharedStructuresKey, so something becomes corrupted during the migration overwrites.
Thanks for your attention.
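For readers following along, the two-pass workaround described above might look roughly like the sketch below. It is only an illustration: ToyStore is a hypothetical in-memory stand-in, not the lmdb-store API, and the 'old:' / 'new:' string tags are made-up placeholders for the two encodings. The point is just that no write happens while the read loop is still running.

```javascript
// Hypothetical in-memory stand-in for a key-value store (NOT lmdb-store).
// Values are kept as tagged strings: 'old:' or 'new:' followed by JSON.
class ToyStore {
  constructor() {
    this.raw = new Map(); // key -> encoded string
  }
  put(key, value) {
    // All writes use the new encoding.
    this.raw.set(key, 'new:' + JSON.stringify(value));
  }
  remove(key) {
    return this.raw.delete(key);
  }
  *getRange() {
    // Iterate over a snapshot so removals during iteration are safe.
    for (const [key, encoded] of [...this.raw]) {
      yield { key, value: JSON.parse(encoded.slice(4)) };
    }
  }
}

// Pass 1: read every record into memory and remove it from the store.
// Pass 2: write all buffered records back, in a separate loop.
function migrateTwoPass(store) {
  const buffered = [];
  for (const { key, value } of store.getRange()) {
    buffered.push({ key, value });
    store.remove(key);
  }
  for (const { key, value } of buffered) {
    store.put(key, value);
  }
  return buffered.length;
}

// Seed the toy store with records in the old encoding.
const store = new ToyStore();
store.raw.set('a', 'old:' + JSON.stringify({ name: 'x', n: 1 }));
store.raw.set('b', 'old:' + JSON.stringify({ name: 'y', n: 2 }));

const migrated = migrateTwoPass(store);
const after = [...store.getRange()];
```

With a real store, the same shape would apply, but the actual calls and their synchronous or asynchronous behaviour depend on the library version in use.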
The issue is that the records were written with shared structures disabled, and in the migration step, they were read with shared structures enabled (which is a different setting). The reason this causes a problem is that with shared structures disabled, the structures are written within each entry/document (they are all assumed to be "private" structures), but when msgpackr starts reading and writing with shared structures enabled, the private structures are read and override the slots of the structures that are supposed to be shared (using the same structure ids), thereby "corrupting" them. It would be possible to add extra checks for this type of thing, but generally msgpackr is written to optimize for performance, and reading a MessagePack document that was written with a different shared structure setting wouldn't really be supported anyway.
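The slot collision described above can be illustrated with a deliberately simplified toy model. This is not msgpackr's wire format or its internals: ToyDecoder, the entry shape ({ id, define, values }) and the sample property names are all assumptions made up for the illustration. It only shows how an inline ("private") structure definition that reuses a structure id can overwrite the slot a shared structure was relying on.

```javascript
// Toy model only: NOT msgpackr's actual format or implementation.
class ToyDecoder {
  constructor(sharedStructures) {
    // Slot table: structure id -> array of property names.
    this.structures = sharedStructures;
  }
  decode(entry) {
    // An inline definition is stored in the same slot table,
    // replacing whatever was in that slot before.
    if (entry.define) this.structures[entry.id] = entry.define;
    const keys = this.structures[entry.id];
    return Object.fromEntries(keys.map((k, i) => [k, entry.values[i]]));
  }
}

// Shared table, as it might be persisted under a shared-structures key.
const shared = [['name', 'age']];
const decoder = new ToyDecoder(shared);

// Entry written with shared structures enabled: refers to slot 0 only.
const sharedEntry = { id: 0, values: ['Ann', 41] };
const before = decoder.decode(sharedEntry);

// Legacy entry written with shared structures disabled: it carries its own
// definition and happens to reuse structure id 0.
const legacyEntry = { id: 0, define: ['city', 'zip'], values: ['Lima', '15001'] };
const legacy = decoder.decode(legacyEntry);

// The same shared entry now decodes with the wrong property names.
const afterLegacy = decoder.decode(sharedEntry);
```

In this model the legacy entry itself decodes correctly, which matches the observation that no error shows up during the migration; the damage only becomes visible when an entry that depends on the shared slot is read afterwards.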
I wrongly assumed that the shared structures would use a different set of tags. Now I have it clear. Thank you very much for the explanation and this great piece of code.
In a database using msgpack as encoding, I suspected that it would be enough to add the sharedStructures option with its key and, through an iterator (getRange), a simple "put" with the same key would suffice to compact all records. The process ran smoothly, but a subsequent access to the database resulted in the aforementioned error and/or mangled key-values.
I'm using node v14.17.0, lmdb-store v1.5.2 and msgpackr v1.3.3 on Fedora 33. A little script to reproduce the problem with sample data is available in https://gist.github.com/salortiz/a4122bae442d02bdeda807ce547d15c4