Closed AgentD closed 4 years ago
UPDATE
The problems regarding memory usage turned out to be a fluke. My test image from a Fedora LiveCD managed to trigger a bug that has since been fixed. I continued testing with a Debian CD, since it actually contains a file system (the Fedora one contains only an ext4 image).
Using 4 jobs, the parallel unpacker in rdsquashfs extracts an entire 2GiB Debian LiveCD image in around 3 minutes on my laptop (half that time on a more powerful Xeon test server I have access to). Both machines use a plain old hard drive that holds the input image and the output file tree, and neither is exactly top-of-the-line hardware anymore. IMO that time is okay-ish. Memory usage turns out to be negligible.
rdsquashfs could still benefit from only extracting the part of the tree we are interested in. Running rdsquashfs -l / on the Debian image chokes for a little over a second on my laptop before spitting out the directory listing.
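To illustrate the "only the part of the tree we are interested in" idea, here is a rough Python sketch. It is not how rdsquashfs is implemented; `TABLE`, `read_dir` and `resolve` are hypothetical names, and the dictionary merely stands in for on-disk directory tables. The point is that resolving a path component by component only touches the directories along that path.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical entry layout: (name, inode number, is_directory).
Entry = Tuple[str, int, bool]

# Toy stand-in for on-disk directory tables, keyed by inode number.
TABLE: Dict[int, List[Entry]] = {
    1: [("etc", 2, True), ("usr", 3, True)],
    2: [("hostname", 4, False)],
    3: [("bin", 5, True)],
    5: [("ls", 6, False)],
}

reads: List[int] = []  # records which directories were actually read


def read_dir(inode: int) -> List[Entry]:
    """Pretend to read one directory from the image; non-directories are empty."""
    reads.append(inode)
    return TABLE.get(inode, [])


def resolve(read: Callable[[int], List[Entry]], root: int, path: str) -> Optional[int]:
    """Return the inode for `path`, reading only the directories on the way."""
    inode = root
    for part in (p for p in path.split("/") if p):
        for name, child, _is_dir in read(inode):
            if name == part:
                inode = child
                break
        else:
            return None  # component not found
    return inode


if __name__ == "__main__":
    print(resolve(read_dir, 1, "/usr/bin"))  # inode of /usr/bin
    print(reads)  # only the root and /usr were read, /etc was never touched
```

With a scheme like this, listing or unpacking a single subdirectory would not require scanning the rest of the image first.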
Currently, operations are performed sequentially:
The "generate tree in memory first, then do something with it" approach works great for small filesystems, but something like unpacking a LiveCD image with rdsquashfs is practically impossible, because it fills up all RAM in the process. This will also be a problem when trying to generate the image with gensquashfs.
The steps need to be interleaved to reduce memory consumption, essentially eliminating the in-memory tree to the extent possible.
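A minimal Python sketch of what interleaving could look like, under the assumption that directories can be read one at a time on demand. `DIRS`, `list_dir` and `walk` are made-up names for illustration, not part of rdsquashfs or gensquashfs. Each entry is handed to the consumer as soon as it is read, so the only state kept is the chain of directories currently being descended rather than the whole tree.

```python
from typing import Callable, Dict, Iterator, List, Tuple

# Hypothetical entry layout: (name, inode number, is_directory).
DirEntry = Tuple[str, int, bool]

# Toy stand-in for on-disk directory tables, keyed by inode number.
DIRS: Dict[int, List[DirEntry]] = {
    1: [("etc", 2, True), ("usr", 3, True)],
    2: [("hostname", 4, False)],
    3: [("bin", 5, True)],
    5: [("ls", 6, False)],
}


def list_dir(inode: int) -> List[DirEntry]:
    """Pretend to read a single directory from the image."""
    return DIRS.get(inode, [])


def walk(
    read: Callable[[int], List[DirEntry]], inode: int, prefix: str = ""
) -> Iterator[Tuple[str, bool]]:
    """Yield (path, is_dir) depth-first without building a tree first."""
    for name, child, is_dir in read(inode):
        path = f"{prefix}/{name}"
        yield path, is_dir  # the consumer acts on this entry immediately
        if is_dir:
            yield from walk(read, child, path)


if __name__ == "__main__":
    for path, is_dir in walk(list_dir, 1):
        # A real unpacker would create the directory or extract the file here.
        print(("d " if is_dir else "f ") + path)
```

Memory use in this sketch grows with the depth of the tree and the size of one directory listing, not with the total number of files.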