AgentD / squashfs-tools-ng

A new set of tools and libraries for working with SquashFS images

Performance improvements #69

Closed AgentD closed 5 months ago

AgentD commented 3 years ago

A lot of the code was written with a "keep it simple, make it work, worry about performance/efficiency later (but keep it in mind and design the data structures to be flexible and opaque enough)" mentality.

As a result, some areas could stand improvement:

AgentD commented 2 years ago

Some initial investigation shows that an LRU fragment block cache in the data reader can shave roughly 10% off the time needed to unpack an image. A random-access benchmark would also be reasonable.
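A minimal sketch of the idea, in C, assuming nothing about the actual squashfs-tools-ng internals (the names `frag_cache_t`, `cache_get`, `cache_put` are made up for illustration): a small fixed-size cache mapping a fragment block index to its decompressed contents, evicting the least recently used slot when full.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_SIZE 8 /* illustrative; tune against the access pattern */

typedef struct {
	uint64_t index; /* fragment block index, UINT64_MAX if the slot is empty */
	uint64_t stamp; /* last-use counter, used for LRU eviction */
	void *data;     /* decompressed block contents (owned by the cache) */
	size_t size;
} cache_entry_t;

typedef struct {
	cache_entry_t ent[CACHE_SIZE];
	uint64_t tick;
} frag_cache_t;

static void cache_init(frag_cache_t *c)
{
	memset(c, 0, sizeof(*c));
	for (size_t i = 0; i < CACHE_SIZE; ++i)
		c->ent[i].index = UINT64_MAX;
}

/* Returns the cached block data, or NULL on a cache miss. */
static const void *cache_get(frag_cache_t *c, uint64_t index, size_t *size)
{
	for (size_t i = 0; i < CACHE_SIZE; ++i) {
		if (c->ent[i].index == index) {
			c->ent[i].stamp = ++c->tick;
			*size = c->ent[i].size;
			return c->ent[i].data;
		}
	}
	return NULL;
}

/* Takes ownership of data. The caller is expected to have tried
 * cache_get() first, so no duplicate check is done here. */
static void cache_put(frag_cache_t *c, uint64_t index, void *data, size_t size)
{
	size_t victim = 0;

	for (size_t i = 0; i < CACHE_SIZE; ++i) {
		if (c->ent[i].index == UINT64_MAX) {
			victim = i; /* prefer an empty slot */
			break;
		}
		if (c->ent[i].stamp < c->ent[victim].stamp)
			victim = i; /* otherwise evict the least recently used */
	}
	free(c->ent[victim].data);
	c->ent[victim] = (cache_entry_t){ index, ++c->tick, data, size };
}
```

The linear scans are fine for a cache this small; the win comes from skipping repeated decompression of the same fragment block, not from the lookup itself.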

mattgodbolt commented 1 year ago

> using linear search in an append-only array of block size & hash. This could be improved by organizing the on-disk block list as a trie/prefix tree.

This one is particularly bad for a use case I'm investigating right now, with large archives. gensquashfs gets slower and slower as the archive proceeds, until it is almost at a crawl towards the end. A trie would be ideal, of course, but even a hash map from checksum to the first block with that checksum, plus a linked list chaining later blocks that share it, would work.
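The suggested structure can be sketched in C as follows (names like `dedup_index_t` are hypothetical, not code from the repository): a fixed-size bucket array maps a checksum to the most recently added block with that checksum, and a per-block `next` array forms the chain, so looking up a candidate duplicate is O(1) on average instead of a linear scan over all blocks.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define TABLE_SIZE 1024u /* bucket count; must be a power of two */

typedef struct {
	uint32_t head[TABLE_SIZE]; /* bucket -> block index + 1, 0 = empty */
	uint32_t *next;            /* per-block: next chained block + 1, 0 = end */
	uint32_t *hash;            /* per-block checksum */
	size_t count, capacity;
} dedup_index_t;

/* Append a block with the given checksum; its index is the insertion order. */
static void dedup_add(dedup_index_t *d, uint32_t hsh)
{
	uint32_t *bucket;

	if (d->count == d->capacity) {
		d->capacity = d->capacity ? d->capacity * 2 : 64;
		d->next = realloc(d->next, d->capacity * sizeof(*d->next));
		d->hash = realloc(d->hash, d->capacity * sizeof(*d->hash));
	}
	bucket = &d->head[hsh & (TABLE_SIZE - 1)];
	d->hash[d->count] = hsh;
	d->next[d->count] = *bucket; /* prepend to the bucket chain */
	*bucket = (uint32_t)d->count + 1;
	d->count++;
}

/* Return the lowest block index whose checksum equals hsh, or -1. */
static long dedup_find(const dedup_index_t *d, uint32_t hsh)
{
	long best = -1;
	uint32_t i;

	for (i = d->head[hsh & (TABLE_SIZE - 1)]; i != 0; i = d->next[i - 1]) {
		if (d->hash[i - 1] == hsh && (best < 0 || (long)(i - 1) < best))
			best = (long)(i - 1);
	}
	return best;
}
```

Chains are only as long as the number of blocks landing in one bucket, so the per-block cost stays near constant instead of growing with the archive.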

mattgodbolt commented 1 year ago

In fairness, I now realise it could be other areas of the code. Just from reading the source, I figured this could explain the apparent O(n²) slowdown.

AgentD commented 1 year ago

I've started looking into the block processor performance over the weekend.

The approach I'm currently looking into uses a hash table that maps block checksums to entries in the blk_info_t array. Then, during de-duplication, the key-equals function is swapped out to compare the entire cluster and its associated data, remembering the matching candidate with the lowest index.
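The swappable key-equals idea can be illustrated like this (a sketch with made-up helper names, not the actual implementation): one cheap comparator checks only the checksum, a second one verifies the full block data, and the candidate scan keeps the lowest matching index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
	uint32_t checksum;
	const uint8_t *data;
	size_t size;
} blk_info_t;

typedef int (*blk_equals_fn)(const blk_info_t *a, const blk_info_t *b);

/* Cheap equality, sufficient while building the table. */
static int checksum_equals(const blk_info_t *a, const blk_info_t *b)
{
	return a->checksum == b->checksum;
}

/* Deep equality swapped in during de-duplication: a checksum match is
 * only a candidate, the block contents must match as well. */
static int deep_equals(const blk_info_t *a, const blk_info_t *b)
{
	return a->checksum == b->checksum && a->size == b->size &&
	       memcmp(a->data, b->data, a->size) == 0;
}

/* Scan candidate indices (e.g. one hash bucket's chain) and return the
 * lowest index for which `equals` accepts the block, or -1. */
static long find_duplicate(const blk_info_t *blocks, const size_t *cand,
                           size_t ncand, const blk_info_t *key,
                           blk_equals_fn equals)
{
	long best = -1;

	for (size_t i = 0; i < ncand; ++i) {
		if (equals(&blocks[cand[i]], key) &&
		    (best < 0 || (long)cand[i] < best))
			best = (long)cand[i];
	}
	return best;
}
```

Keeping the comparator as a function pointer means the same table code serves both the fast insertion path and the thorough de-duplication path.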

I'm still ironing out some bugs at the moment, as well as cobbling together a benchmark. How/if it improves performance remains to be seen.

thesamesam commented 1 year ago

FWIW, the next release of xz (5.4.0, it's in alpha right now, planned for early December or so) supports parallel decompression:
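For reference, threaded xz decompression only helps on multi-block streams, i.e. files that were themselves compressed with threading (or an explicit block size). A minimal usage sketch with a throwaway file name:

```shell
# Create a 1 MiB test file (contents don't matter for the demonstration).
head -c 1048576 /dev/urandom > sample.bin

# Compress multi-threaded: -T splits the stream into independent blocks,
# which is what makes parallel decompression possible later.
xz -T4 sample.bin          # produces sample.bin.xz, removes sample.bin

# Decompress with up to 4 threads (xz >= 5.4.0; older versions accept
# the flag but decode single-threaded).
xz -d -T4 sample.bin.xz    # restores sample.bin
```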