plougher / squashfs-tools

tools to create and extract Squashfs filesystems
GNU General Public License v2.0

Feature request: Parallel file reading #239

Open nh2 opened 1 year ago

nh2 commented 1 year ago

Currently mksquashfs seems to use a single reader thread.

Many current devices only achieve optimal throughput when files are read from them in parallel:

Could mksquashfs add (configurable) threaded reading?

Thanks!

plougher commented 1 year ago

This is an interesting request (the second in one week). When I first parallelised Mksquashfs, in about 2006, I ran extensive experiments reading the source filesystem with one thread and with multiple threads. These showed that maximum performance was obtained with a single reader thread (so you're right that there is only one reader thread). But this was in the days of mechanical hard drives with slow seeking, so the results were not that surprising: by and large, anything that caused seeking (including parallel reading of files) produced worse performance.
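For anyone wanting to repeat this kind of comparison on current hardware, the following is a minimal Python sketch, not the original 2006 experiment and not Mksquashfs code. It reads the same set of files with a configurable number of reader threads and reports the elapsed time. The helper names (`read_file`, `read_all`, `make_sample_files`) and the sample file sizes are assumptions made up for illustration.

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor


def read_file(path, chunk_size=1 << 20):
    """Read one file to the end and return the number of bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def read_all(paths, threads):
    """Read every file using `threads` reader threads; return (bytes, seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        total = sum(pool.map(read_file, paths))
    return total, time.perf_counter() - start


def make_sample_files(directory, count=64, size=4096):
    """Create `count` small files of `size` bytes each (hypothetical test data)."""
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"file{i:03d}.bin")
        with open(path, "wb") as f:
            f.write(os.urandom(size))
        paths.append(path)
    return paths


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        paths = make_sample_files(d)
        for n in (1, 4, 16):
            nbytes, secs = read_all(paths, n)
            print(f"{n:2d} reader thread(s): {nbytes} bytes in {secs:.4f}s")
```

Freshly written files are normally served from the page cache, so a meaningful measurement would need a real source tree on the device under test and a cold cache. The timings from the temporary files here only show that the harness works.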

Modern hardware, including RAID (*) and SSDs, may have changed the situation. So I'll add this to the list of enhancements and see if priorities allow it to be looked at for the next release.

(*) RAID has been around since the late 1980s. In fact, I implemented a block-striping RAID system in 1991. But RAID systems have become more and more widespread in recent years.

As far as RAID is concerned, I assume these systems are using block striping rather than bit striping; otherwise there should not be an issue. Also, since readahead should kick in for large files and so utilise all the disks under block striping, I assume the issue is with small files, which do not benefit from readahead.
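The small-file point can be illustrated with a short Python sketch of block striping. It assumes a plain RAID 0 style layout with no parity, where consecutive stripes rotate round-robin across the disks, and it treats `offset` as a byte offset on the striped device. Real arrays and filesystems place data in more complicated ways, and the function name `disks_touched` and the 64 KiB stripe size are illustrative assumptions only.

```python
def disks_touched(offset, length, stripe_size, num_disks):
    """Return the sorted disk indices that a byte range touches.

    Assumes simple round-robin block striping: stripe k lives on
    disk k % num_disks.
    """
    if length <= 0:
        return []
    first_stripe = offset // stripe_size
    last_stripe = (offset + length - 1) // stripe_size
    count = last_stripe - first_stripe + 1
    if count >= num_disks:
        # The range covers at least one full rotation, so every disk is used.
        return list(range(num_disks))
    # Fewer stripes than disks: each stripe lands on a distinct disk.
    return sorted((first_stripe + i) % num_disks for i in range(count))


if __name__ == "__main__":
    stripe = 64 * 1024  # assumed stripe size of 64 KiB
    disks = 4           # assumed number of disks

    # A 4 KiB file fits inside one stripe, so a single reader keeps one disk busy.
    print(disks_touched(0, 4 * 1024, stripe, disks))      # [0]

    # A 1 MiB sequential read spans 16 stripes and reaches every disk.
    print(disks_touched(0, 1024 * 1024, stripe, disks))   # [0, 1, 2, 3]
```

Under these assumptions, one reader working through small files one at a time leaves most disks idle at any moment, while several concurrent readers could have requests outstanding on different disks.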