threefoldtech / 0-db

Fast write ahead persistent redis protocol key-value store
Apache License 2.0

zdbd: allow large data chunk transfers for clone #159

Open maxux opened 10 months ago

maxux commented 10 months ago

In order to quickly clone one namespace to another zdb instance, a dedicated command that transfers a full chunk of the data file in one shot would be really valuable, to take advantage of full line speed.

There is already a DATA RAW command which fetches a specific entry based on its offset, but that's inefficient when there are lots of small entries.

This command would obviously be restricted to administrators, since it could leak data and slow down the process.
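For contrast, here is a minimal sketch of what the slow path looks like from a client, assuming redis-py and assuming DATA RAW takes a data file id and an offset (the exact argument shape is an assumption, not a confirmed signature). The point is that each entry costs a full network round trip, which is what kills throughput when entries are small:

```python
# Hypothetical sketch: fetching entries one by one with DATA RAW.
# The (data_id, offset) argument shape is an assumption.
import redis

zdb = redis.Redis(host="localhost", port=9900)

def fetch_entries(locations):
    """locations: iterable of (data_id, offset) pairs to fetch."""
    for data_id, offset in locations:
        # One full network round trip per entry: fine for a few large
        # objects, very inefficient for thousands of small ones.
        yield zdb.execute_command("DATA", "RAW", data_id, offset)
```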

maxux commented 8 months ago

Implementation started on the development-v2-data-segment branch. There is already a working version which can export and import parts of the data, on a local database.

This feature allowed me to clone a full 31 GB namespace (locally) in 1 minute 58 seconds, without any external tools.

There are two new commands (only available to administrators): EXPORT and IMPORT.

When doing an EXPORT, zdb sends a 4 MB chunk at data_id:data_offset to the client in one shot. The client can't choose the chunk size: the 4 MB is hardcoded, and it seems like a good trade-off because it avoids locking zdb for too long, takes advantage of line bandwidth, stays below any hard limit set at the redis protocol level, and doesn't consume a lot of memory.
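A rough sketch of how a client could consume this, assuming redis-py and assuming EXPORT takes a data file id plus an offset and returns raw bytes (the exact invocation is an assumption). Since the 4 MB size is fixed server-side, the client only needs to advance its offset by however many bytes it actually received:

```python
def export_datafile(zdb, data_id):
    """Yield (offset, chunk) pairs until the data file is exhausted.

    zdb: a redis-py client connected to the source instance.
    """
    offset = 0
    while True:
        # Hypothetical call: command name and arguments inferred from
        # the description above. Each reply is at most 4 MB (server-side
        # hardcoded size).
        chunk = zdb.execute_command("EXPORT", data_id, offset)
        if not chunk:
            break  # end of this data file
        yield offset, chunk
        offset += len(chunk)
```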

Import works the same way, except that you can only import into the current (last) data_id; you can't import into an already closed (immutable) data file. In addition, importing is only allowed on a frozen namespace, to avoid any side changes. This feature is designed to clone a namespace from scratch; it can't be used to clone into a similar namespace if the data are not exactly the same.

There is already a script in place which implements the import workflow: tools/export-import/eximport.py. A rough sketch of the loop follows.
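For illustration, this is roughly what such a clone loop might look like, assuming redis-py, assuming EXPORT/IMPORT behave as described above, and assuming IMPORT takes the raw chunk as its payload. How the data file ids are enumerated is also an assumption and is not shown here:

```python
import redis

def clone_namespace(src, dst, data_ids):
    """Stream every data file from src to dst, one 4 MB chunk at a time.

    data_ids: the ordered sequence of data file ids to copy; enumerating
    them is left to the caller (assumption).
    """
    for data_id in data_ids:
        offset = 0
        while True:
            # Hypothetical calls: command names follow the description
            # above, argument/payload shapes are assumptions.
            chunk = src.execute_command("EXPORT", data_id, offset)
            if not chunk:
                break
            # IMPORT only appends to the current (last) data file of a
            # frozen namespace, so chunks must be replayed strictly in order.
            dst.execute_command("IMPORT", chunk)
            offset += len(chunk)

src = redis.Redis(host="source-zdb", port=9900)
dst = redis.Redis(host="target-zdb", port=9900)
clone_namespace(src, dst, data_ids=range(8))  # example: copy data files 0..7
```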

The next step is getting the index ready. The best solution, in my opinion, is to implement an INDEX REBUILD based on the data files, so the index can be recreated from scratch from the data files alone. There is already an issue discussing that; it would be nice to have: #160.
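To make the idea concrete, here is a purely conceptual sketch of what INDEX REBUILD amounts to, assuming each data file entry carries its own key (zdb's real on-disk layout, and the real C implementation this would need inside zdbd, are not reproduced here). Replaying the data files in write order regenerates a key-to-location index, with later entries naturally superseding earlier ones:

```python
def rebuild_index(entries_by_datafile):
    """Conceptual only: entries_by_datafile maps data_id -> [(offset, key)],
    as obtained by walking each data file sequentially."""
    index = {}
    for data_id in sorted(entries_by_datafile):
        for offset, key in entries_by_datafile[data_id]:
            # Later writes win: replaying files in order resolves
            # overwritten keys without any extra bookkeeping.
            index[key] = (data_id, offset)
    return index
```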