ckolivas / lrzip

Long Range Zip
http://lrzip.kolivas.org
GNU General Public License v2.0
617 stars 76 forks source link

CPU detection does not account for CPU affinity #252

Open mgorny opened 7 months ago

mgorny commented 7 months ago
$ taskset -c 0-5 nproc
6
$ taskset -c 0-5 lrzip -vv aes.c
The following options are in effect for this COMPRESSION.
Threading is ENABLED. Number of CPUs detected: 12
Detected 33571696640 bytes ram
Compression level 7
Nice Value: 19
Show Progress
Max Verbose
Temporary Directory set as: ./
Compression mode is: LZMA. LZ4 Compressibility testing enabled
Heuristically Computed Compression Window: 213 = 21300MB
Storage time in seconds 1398224757
Output filename is: aes.c.lrz
File size: 16531
Succeeded in testing 16531 sized mmap for rzip pre-processing
Will take 1 pass
Chunk size: 16531
Byte width: 2
Warning, low memory for chosen compression settings
Unable to allocate enough memory for operation
Failed to open streams in rzip_chunk
Deleting broken file aes.c.lrz
Fatal error - exiting

Noticed this while trying to work around #249. FWICS nproc(1) uses sched_getaffinity() to get the correct value for the number of CPUs available to the process.

pete4abw commented 7 months ago

Just use lrzip -p# to restrain CPUs. lrzip is a standalone monolithic program that manages its own environment.

mgorny commented 7 months ago

That doesn't help when you can't control what options are passed to lrzip, i.e. when it's run implicitly by libarchive.

pete4abw commented 7 months ago

News to me. libarchive data formats.

lrzip no longer has a library interface. If you use tar -I 'lrzip -options...' -cf file.tar.lrz you can pass any options you want. Even withlrztar`. Again, lrzip is best applied as a standalone program. Anything else...YMMV.

I certainly would not take time for this. Patches always encouraged.

mgorny commented 7 months ago

It supports it via calling lrzip executable (https://github.com/search?q=repo%3Alibarchive%2Flibarchive%20lrzip&type=code).

pete4abw commented 7 months ago

It supports it via calling lrzip executable (https://github.com/search?q=repo%3Alibarchive%2Flibarchive%20lrzip&type=code).

Then the issue will be with libarchive to allow for parameters. You do, however, point out a deficiency in the lrzip.conf file. It does not have a parameter for setting the number of CPUs, the -p# option. Were there such an option, such as:

PROCESSORS = ##

lrzip/lrzip-next will force that on the program (subject to command line override). As I no longer contribute to lrzip, I will look at this possibility in my own fork. Good luck.

pete4abw commented 7 months ago

@mgorny This will help. I implemented this separately. Just add the line PROCESSORS = ## to your lrzip.conf file. I never understood why more people do not use it!

$ git diff util.c
diff --git a/util.c b/util.c
index f643303..c17f35e 100644
--- a/util.c
+++ b/util.c
@@ -265,6 +265,12 @@ bool read_config(rzip_control *control)
                        /* default is yes */
                        if (isparameter(parametervalue, "no"))
                                control->flags &= ~FLAG_THRESHOLD;
+               } else if (isparameter(parameter, "processors")) {
+                       control->threads = atoi(parametervalue);
+                       if (control->threads < 1) {
+                               print_err("CONF.FILE error. PROCESSORS value must be >= 1. Resetting to default. %d\n", PROCESSORS);
+                               control->threads = PROCESSORS;
+                       }
                } else if (isparameter(parameter, "hashcheck")) {
                        if (isparameter(parametervalue, "yes")) {
                                control->flags |= FLAG_CHECK;
mgorny commented 7 months ago

Thanks, but I don't think we're going to patch lrzip locally in Gentoo. Given the state of the project, perhaps a better option would be to remove it from Gentoo entirely.

ckolivas commented 7 months ago

Unfortunately threatening me will not magically give me more time to work on this project. If you wish to remove it from your distribution based on this use case that is entirely your prerogative. The project is sorely lacking in people who understand the code well enough to contribute only bug fixes to address outstanding issues, as most are interested in adding features instead. I will eventually get around to addressing known issues as I do at infrequent intervals but that is likely still months away. Whilst Peter's patch will give you the ability to control processors in the configuration file and is a worthwhile addition in its own right, it does not address your issue. Patches are always welcome.