Open francisdb opened 1 year ago
Looks like the write path is quite affected by opt-level for tests. opt-level = 1 already yields much better results.
But I wonder if some things could still be improved, e.g. constant time for writing new streams.
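For anyone who wants to try the same thing, the opt-level setting can be sketched as a Cargo profile override. This is only an assumption about where the setting was applied; the comment above does not say which profile was changed. Cargo's test profile inherits from dev, so overriding either one affects `cargo test`.

```toml
# Assumed placement: Cargo.toml of the crate whose tests are slow.
# Raises optimization for test builds only; the default for this profile is 0.
[profile.test]
opt-level = 1
```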
Thanks for the report. There's probably a lot of low-hanging fruit for improving performance; most of the work on this crate so far has been focused on trying to achieve correctness. PRs for improvements would be welcome, especially if they come with benchmark tests. (That said, my inclination is not to worry too much about performance in debug builds. If performance is bad even for release builds, then that's more worth addressing.)
I haven't had a chance to dig into this specific issue, but one possible cause is that this implementation doesn't currently keep the directory entry binary search tree balanced (fixing that has been on my to-do list for a long time). So adding a lot of sibling nodes will eventually create a long search chain. I wouldn't expect this to make all that much difference for only 100 entries, but maybe in debug builds it does. Or maybe there's another performance issue I'm not thinking of.
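To illustrate the "long search chain" effect, here is a minimal sketch of an unbalanced binary search tree. It is not this crate's code: `Node`, `Tree` and `total_visits` are hypothetical names, and plain integer keys stand in for directory entry names. It only counts how many existing nodes each insertion has to walk past.

```rust
// Hypothetical, simplified stand-in for a directory entry: only the
// child links of the search tree are modelled.
struct Node {
    key: u32,
    left: Option<usize>,
    right: Option<usize>,
}

/// Unbalanced binary search tree stored in a flat vector (index 0 is the root).
struct Tree {
    nodes: Vec<Node>,
}

impl Tree {
    fn new() -> Self {
        Tree { nodes: Vec::new() }
    }

    /// Inserts `key` without rebalancing and returns how many existing
    /// nodes were visited to find the insertion point.
    fn insert(&mut self, key: u32) -> usize {
        let new_index = self.nodes.len();
        self.nodes.push(Node { key, left: None, right: None });
        if new_index == 0 {
            return 0; // the first node becomes the root
        }
        let mut current = 0;
        let mut visited = 0;
        loop {
            visited += 1;
            let go_left = key < self.nodes[current].key;
            let next = if go_left {
                self.nodes[current].left
            } else {
                self.nodes[current].right
            };
            match next {
                Some(child) => current = child,
                None => {
                    // Free slot found: link the new node here.
                    if go_left {
                        self.nodes[current].left = Some(new_index);
                    } else {
                        self.nodes[current].right = Some(new_index);
                    }
                    return visited;
                }
            }
        }
    }
}

/// Total number of node visits needed to insert `keys` in the given order.
fn total_visits(keys: impl IntoIterator<Item = u32>) -> usize {
    let mut tree = Tree::new();
    keys.into_iter().map(|k| tree.insert(k)).sum()
}
```

With keys inserted in sorted order, every new node lands at the end of a single chain, so `total_visits(0..100)` comes to 4950 visits (0 + 1 + ... + 99). That is quadratic growth, but still a small absolute number for 100 entries, which fits the expectation that this alone would not explain a large slowdown outside debug builds.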
I'm working on a lib that reads vpx files (Visual Pinball). See https://github.com/francisdb/vpxtool
These files are easily 200 MB and contain hundreds of streams of up to 10 MB.
Reading the whole file is reasonably fast, but once you start writing the streams, things slow down considerably, even in memory. I wonder if I am doing something wrong.
Reading a typical file takes about 1 second. Writing it out again takes 20 seconds.
Test code and output below. You can play with the range and the data size.
logs