Open jsrawan-mobo opened 11 years ago
@jsrawan-mobo Thanks for taking the time to investigate this!
I am painfully aware of the sub-optimal performance. I have been tracking it under issue #1, but haven't really found the motivation to fix it yet.
It seems you have made some fixes/enhancements. Did you mean to open a pull request? Can you point me to where you have made these fixes?
See Pull Request #24.
It's not completely done, but you can try it and see the performance improvement from skipping past lzf_decompress() and storing an index for a deep dump later.
If you like where it's headed, I can clean it up and submit a proper pull request.
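Not the PR's actual code, but a minimal sketch of the idea, using a hypothetical toy format (a 4-byte big-endian length prefix followed by the compressed payload) in place of the real RDB encoding: on the first pass, seek past each payload instead of decompressing it, and record its file offset so a later pass can jump straight to the one entry worth decompressing.

```python
import io
import struct

def skip_value(f, index, key):
    # First pass: instead of decompressing, read only the length prefix,
    # seek past the payload, and record where it lives in the file.
    pos = f.tell()
    clen = struct.unpack(">I", f.read(4))[0]
    f.seek(clen, io.SEEK_CUR)        # jump over the payload
    index[key] = (pos, clen)         # remember offset for a deep dump later

def deep_read(f, index, key):
    # Second pass: seek directly to the recorded offset and read
    # only the one entry the user asked about.
    pos, clen = index[key]
    f.seek(pos + 4)                  # +4 skips the length prefix
    return f.read(clen)

# Demo on an in-memory "file"
buf = io.BytesIO()
for payload in (b"alpha", b"bravo!", b"charlie"):
    buf.write(struct.pack(">I", len(payload)) + payload)
buf.seek(0)

index = {}
for key in ("k1", "k2", "k3"):
    skip_value(buf, index, key)

print(deep_read(buf, index, "k2"))   # b'bravo!'
```

The win is that the expensive work (LZF decompression and object parsing) is deferred and done only for keys you actually inspect, while the quick pass touches each entry's header once.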
Have you been able to improve it? Would it be possible to release it? For a huge DB (about 50 GB / 1 million keys) on a very fast server it takes around half a day, as it's single-threaded.
Thanks, Alex
I hadn't looked at this in a few years; it seems this project went stale. The pull request I put up does work in quick mode like this, if you want to give it a try:
1) Generate a quick memory dump and index. In quick mode, only compressed_size is valid.
   rdb.py -c memory -q --file redis_memory_quick.csv redis.rdb
2) After viewing, dump a hash/list to view the contents of an offending key.
   rdb.py -c memory --max 1 --pos 3568796958 -v --key mongow --file redis_memory_mongow.csv redis.rdb
I'd be willing to fix this up if someone finds use for it, or fork the repo.
For a very large RDB, the memory dump can take upwards of 30 minutes. Even slower, the "key" feature requires a sequential scan over the whole file.
Finally, I wanted to further introspect a data structure like a hash, list, or set to find out which field is taking up the most memory. In my case I use Celery as a worker queue, and some tasks can be gigantic.
So I've made some enhancements, such as:
i) Reduced the quick-mode dump time to about 5 minutes
ii) Allowed re-seeking for key contents in seconds, plus a limit mode
iii) Allowed verbose dumping of hash/list/set structures to a file
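Enhancement iii) boils down to ranking a structure's fields by size so the biggest memory consumers surface first. A hedged sketch (the field names and the size heuristic here are illustrative, not from the PR), using a plain dict to stand in for a parsed Redis hash:

```python
def largest_fields(hash_value, top=3):
    # Rank a hash's fields by approximate serialized size (encoded byte
    # length of the value), biggest first, so an oversized field such as
    # a giant Celery task payload is immediately visible.
    sized = [(len(str(v).encode()), k) for k, v in hash_value.items()]
    return [k for _, k in sorted(sized, reverse=True)[:top]]

# Hypothetical task hash with one bloated field
task_hash = {"id": "42", "state": "PENDING", "payload": "x" * 10_000}
print(largest_fields(task_hash, top=2))  # ['payload', 'state']
```

In a verbose dump you would emit one such ranked line per hash/list/set key instead of only the key-level total.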