Hi Alex,

Thanks again for your amazing work and for making this library available as an open-source product.

While working on a forensic parser for Microsoft Teams, I noticed that the way metadata is fetched for the IndexedDB could be improved. Currently, iterate_records_raw() is called three times even though the database does not change between the calls, which makes metadata collection quite slow. By unifying the metadata collection, I was able to significantly reduce the time needed to loop through a large database.
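To illustrate the general idea, here is a simplified sketch. It is not the actual diff: FakeDb, the key prefixes, and both collect_* functions are hypothetical stand-ins, and only the name iterate_records_raw() is borrowed from the library. It assumes that "unifying" means sharing one scan of the raw records between the three kinds of metadata instead of scanning once per kind.

```python
from typing import Dict, Iterable, Iterator, Tuple

Record = Tuple[bytes, bytes]

# Hypothetical key prefixes for three kinds of metadata (not the real key layout).
GLOBAL, DATABASE, OBJECT_STORE = b"g:", b"d:", b"o:"


class FakeDb:
    """Hypothetical stand-in for the raw store; counts how often it is scanned."""

    def __init__(self, records: Iterable[Record]) -> None:
        self._records = list(records)
        self.scan_count = 0

    def iterate_records_raw(self) -> Iterator[Record]:
        # Each call walks every record, which is the expensive part on a large DB.
        self.scan_count += 1
        yield from self._records


def collect_three_passes(db: FakeDb):
    """One full scan per metadata kind: three scans in total."""
    return tuple(
        {k: v for k, v in db.iterate_records_raw() if k.startswith(prefix)}
        for prefix in (GLOBAL, DATABASE, OBJECT_STORE)
    )


def collect_single_pass(db: FakeDb):
    """One shared scan that sorts each record into the bucket it belongs to."""
    buckets: Dict[bytes, Dict[bytes, bytes]] = {
        GLOBAL: {},
        DATABASE: {},
        OBJECT_STORE: {},
    }
    for key, value in db.iterate_records_raw():
        bucket = buckets.get(key[:2])  # records with other prefixes are skipped
        if bucket is not None:
            bucket[key] = value
    return buckets[GLOBAL], buckets[DATABASE], buckets[OBJECT_STORE]
```

Both functions return the same three dictionaries; the difference is only in how many times the underlying records are walked, which is what dominates on a large database.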

Benchmark

As a benchmark, I looped over the following IndexedDB (which contains several object stores and records) using the included benchmark.py script:

https://github.com/lxndrblz/forensicsim/tree/main/testdata/John%20Doe/IndexedDB/https_teams.microsoft.com_0.indexeddb.leveldb

The run took 432 seconds before the optimisation and 265 seconds after it, a reduction of roughly 39% (about 1.6 times faster).

Let me know what you think.

Alex

cclgroupltd / ccl_chromium_reader

feat: improve metadata collection of IndexedDB #11