Closed JCalvi closed 1 month ago
I just ran some tests with xz:
root@VM-Bookworm:~# fallocate -l 1G test
root@VM-Bookworm:~# xz -T0 test
xz: Reduced the number of threads from 4 to 2 to not exceed the memory usage limit of 492 MiB
root@VM-Bookworm:~# xz -d test.xz
root@VM-Bookworm:~# xz -T4 test
This was with 2 GiB of memory allocated to the VM. After raising it to 4 GiB, -T0 used 4 threads and printed no such message.
Checking the docs, there is indeed a default limit for multi-threaded decompression, and for compression with -T0, which seems to be 25% of the physical RAM size:
root@VM-Bookworm:~# xz --info-memory
Hardware information:
Amount of physical memory (RAM): 3913 MiB (4102864896 B)
Number of processor threads: 4
Memory usage limits:
Compression: Disabled
Decompression: Disabled
Multi-threaded decompression: 979 MiB (1025716224 B)
Default for -T0: 979 MiB (1025716224 B)
And with 2 GiB indeed it is 492 MiB:
root@VM-Bookworm:~# xz --info-memory
Hardware information:
Amount of physical memory (RAM): 1967 MiB (2062090240 B)
Number of processor threads: 4
Memory usage limits:
Compression: Disabled
Decompression: Disabled
Multi-threaded decompression: 492 MiB (515522560 B)
Default for -T0: 492 MiB (515522560 B)
This can be raised, e.g. with -T0 -M75% to allow up to 75% of the physical RAM size before the number of threads is lowered. I am not sure which value is reasonable, but 50% as a minimum should be perfectly fine.
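The 25% figure can be checked against the two --info-memory outputs above with plain integer arithmetic. This is a minimal sketch: the function names percent_of_ram and bytes_to_mib_ceil are hypothetical helpers, and rounding MiB values up is an assumption that happens to match the numbers xz printed here, not a statement about how xz computes them internally.

```shell
# Hypothetical helper: PERCENT of a RAM size given in bytes (integer arithmetic).
percent_of_ram() {
    ram_bytes=$1
    percent=$2
    echo $(( ram_bytes * percent / 100 ))
}

# Hypothetical helper: bytes to MiB, rounded up (assumption: matches the xz output above).
bytes_to_mib_ceil() {
    echo $(( ($1 + 1048575) / 1048576 ))
}

# RAM sizes taken from the two "xz --info-memory" outputs in this thread.
percent_of_ram 4102864896 25   # 1025716224, the reported default with 4 GiB
percent_of_ram 2062090240 25   # 515522560, the reported default with 2 GiB
bytes_to_mib_ceil 515522560    # 492
```

Calling percent_of_ram with 50 or 75 instead gives a rough idea of what -M50% or -M75% would allow on the same VM.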
We actually limited it ourselves for systems with a higher CPU core count and lower RAM, but at a much higher level than what xz does by itself. This was needed when we used p7zip and 7zip archives before, which ran into OOM on systems with little RAM but multiple cores, like an RPi 2 with 4 cores but only 1 GiB RAM. But since xz does this automatically, and even much more strictly, I removed our obsolete handling and raised the limit of xz to 50%: https://github.com/MichaIng/DietPi/commit/3516a6f
Let me know if you think an even higher limit would be fine.
I would say an even higher limit would be fine. With 4096 MB and -T4 set manually, there were no issues running the imager. Before posting, I had tried the VM with 8192 MB RAM and still got only 1 CPU at the xz stage, though. I will test again with even more RAM to confirm that -T0 does start using multiple threads.
Remember that you need to take the dev branch version of the imager script until DietPi v9.8 has been released.
Indeed, I have not yet tested how much memory the compression of a typical image even requires. When you test, please also check whether -T4 starts to use swap space with 4 GB RAM. In that case, it might be even faster to use only 2 threads without swapping.
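One way to check for swapping while xz runs, as a minimal sketch for Linux: read SwapTotal and SwapFree from /proc/meminfo and print the difference. The function name swap_used_kib is hypothetical, and the optional file argument exists only so the parsing can be tried on a saved copy.

```shell
# Hypothetical helper: print the swap currently in use, in KiB.
# Reads /proc/meminfo by default; another meminfo-style file can be passed as $1.
swap_used_kib() {
    meminfo_file="${1:-/proc/meminfo}"
    awk '/^SwapTotal:/ {total = $2}
         /^SwapFree:/  {free = $2}
         END {print total - free}' "$meminfo_file"
}

# Usage (run once before and again during the compression, then compare):
#   swap_used_kib
```

If the value grows noticeably while xz -T4 is running, the threads are likely pushing the system into swap.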
Just tested the dev branch with 8192 MB on VirtualBox. Got 3 threads utilised, no swap file and plenty of memory left. I can easily give the VM more RAM, but I think you could afford to be less conservative.
G_EXEC_DESC='Creating final xz archive' G_EXEC xz -9e -T0 -M50% -k "$OUTPUT_IMG_NAME.$OUTPUT_IMG_EXT"
50%
G_EXEC_DESC='Creating final xz archive' G_EXEC xz -9e -T0 -M75% -k "$OUTPUT_IMG_NAME.$OUTPUT_IMG_EXT"
Was the RAM usage still rising in both cases? At the time of the screenshot, it was not even close to the 50%/75% yet. I'll raise it to 75% anyway.
It had peaked in both cases. xz seems to be conservative in its definition of the percentage as well.
I added 2 more CPUs and it still only uses 4 at 75%, even though only 50% of the RAM is actually committed. So 75% is still quite a safe and conservative setting.
Here is one at -M95% with 6 CPUs allocated and 8192 MB RAM.
75% should be super safe for all users. The only issue I could foresee is if much larger images need more RAM (I've only tested images up to 10 GB). I suspect xz drops CPUs if the RAM limit is approached anyway.
Yeah, seems fine then. And yes, the larger the image size, the higher the RAM usage, and it would:
in that order/priority to meet the 75% limit.
Thanks Michalng,
great work as always.
DietPi 9.7.1 running on the latest VirtualBox on Windows 11 with the latest extensions.
dietpi-imager works great except for the xz compression, which only uses 1 CPU. If I force the threads in the script to -T2 or -T4 etc., then the appropriate number of CPUs is used.
The cpu command in the guest returns:
[WARNING] Most CPU info is not available on virtual machines.
Architecture | x86_64
Temperature | N/A
Governor | N/A
htop shows 4 CPUs and, as mentioned, all are used if local threads=0 is changed to local threads=4 in the dietpi-imager script.
Is this expected behaviour? Do guest extensions need to be installed for xz to detect the correct number of threads for the -T0 setting?
Otherwise great script, works really well to create small and flexible images.
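To see how many threads xz itself detects inside the guest, its own hardware report can be parsed, as a minimal sketch. The function name xz_detected_threads is hypothetical, and the parsing assumes the "Number of processor threads:" line as printed by the xz version shown in this thread; other versions or locales may format the report differently.

```shell
# Hypothetical helper: extract the thread count from "xz --info-memory" output on stdin.
# Assumes a line of the form "Number of processor threads: N".
xz_detected_threads() {
    awk -F': *' '/Number of processor threads:/ {print $2}'
}

# Usage (LC_ALL=C avoids translated labels):
#   LC_ALL=C xz --info-memory | xz_detected_threads
#   nproc    # for comparison with what the kernel reports
```

If both numbers match the CPUs assigned to the VM, detection is fine and a lower thread count at -T0 more likely comes from the memory limit.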