Blosc / bloscpack

Command line interface to and serialization format for Blosc
BSD 3-Clause "New" or "Revised" License

CPU core use #96

Open victor987 opened 4 years ago

victor987 commented 4 years ago

It seems bloscpack is not detecting or using the available CPU cores.

Tested on Ubuntu Server 19.04 with four AMD Opteron 6282 SE processors, for a total of 64 cores on the system. I'm using incompressible random data for this example, but the behaviour is the same with real data.

$ sudo apt install bloscpack
$ blpk --version
bloscpack: '0.15.0' python-blosc: '1.7.0' blosc: '1.15.1'
$ dd if=/dev/urandom of=test bs=1M count=10k
$ blpk -v compress test
blpk: using 64 threads
blpk: getting ready for compression
blpk: input file is: 'test'
blpk: output file is: 'test.blp'
blpk: input file size: 10.0G (10737418240B)
blpk: nchunks: 10240
blpk: chunk_size: 1.0M (1048576B)
blpk: last_chunk_size: 1.0M (1048576B)
blpk: output file size: 10.0G (10738524192B)
blpk: compression ratio: 0.999897
blpk: done
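The figures in the verbose output above can be cross-checked with a few lines of Python. This is only a sanity check on the reported numbers, not a description of how bloscpack computes them; the function name `check_log_figures` is made up for illustration, and reading the ratio as input size divided by output size is an assumption that happens to match the log.

```python
def check_log_figures(input_size, output_size, chunk_size):
    """Recompute the derived values from the sizes printed by `blpk -v`."""
    nchunks, remainder = divmod(input_size, chunk_size)
    overhead = output_size - input_size
    return {
        "nchunks": nchunks,
        "remainder": remainder,            # 0 means the last chunk is a full chunk
        "overhead_bytes": overhead,
        "overhead_per_chunk": overhead / nchunks,
        # Assumption: ratio = input / output, which reproduces the logged value.
        "ratio": round(input_size / output_size, 6),
    }


if __name__ == "__main__":
    # Sizes copied from the log: 10.0G input, 1.0M chunks.
    figures = check_log_figures(10737418240, 10738524192, 1048576)
    for name, value in figures.items():
        print(f"{name}: {value}")
```

With the sizes from the log this gives 10240 chunks, no partial last chunk, 1105952 bytes of overhead (roughly 108 bytes per chunk) and a ratio of 0.999897, so the output is internally consistent and the random data was stored essentially uncompressed.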

Activity during compression shows that 4 cores out of 64 are used:

Screenshot from 2019-09-19 23-47-10

Specifying 64 threads does not change the behaviour:

$ blpk -v -n 64 compress test
blpk: using 64 threads
blpk: getting ready for compression
blpk: input file is: 'test'
blpk: output file is: 'test.blp'
blpk: input file size: 10.0G (10737418240B)
blpk: nchunks: 10240
blpk: chunk_size: 1.0M (1048576B)
blpk: last_chunk_size: 1.0M (1048576B)
blpk: output file size: 10.0G (10738524192B)
blpk: compression ratio: 0.999897
blpk: done

Activity during compression shows that 4 cores out of 64 are used:

Screenshot from 2019-09-19 23-54-14
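Since the log reports 64 threads while only 4 cores appear busy, one thing that could be ruled out quickly is a CPU affinity restriction on the process. The sketch below uses only the Python standard library; it is a generic diagnostic, not something bloscpack itself does, and the helper name `visible_cpus` is hypothetical. `os.sched_getaffinity` is available on Linux but not on every platform, so the code falls back to the total count where it is missing.

```python
import os


def visible_cpus():
    """Return (total logical CPUs, CPUs this process is allowed to run on)."""
    total = os.cpu_count()
    try:
        # Linux: the set of CPUs the current process may be scheduled on.
        usable = len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity (e.g. macOS): fall back to total.
        usable = total if total is not None else 1
    return total, usable


if __name__ == "__main__":
    total, usable = visible_cpus()
    print(f"logical CPUs reported by the OS: {total}")
    print(f"CPUs usable by this process:     {usable}")
```

If both numbers come back as 64 on the machine described here, affinity is not the limiting factor and the cause lies elsewhere.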

esc commented 3 years ago

@victor987 thank you for bringing this to our attention. I just looked at this and the problem may well be worse: regardless of what I try, I can only get bloscpack to run in single-threaded mode. My guess is that this is an issue with the underlying python-blosc, but more triage will be needed.

esc commented 3 years ago

My single-threaded observations may be specific to OSX. I just tried on a Linux server with 8 cores and can use --nthreads without issues.

esc commented 3 years ago

Oh, wait, it seems like the default number of threads is now 8(?) and I am unable to change it to anything else. 8 threads are always used.
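One way to test the guess that python-blosc is involved would be to bypass bloscpack and time python-blosc directly at several thread counts. The following is a minimal sketch under stated assumptions: it relies on `blosc.set_nthreads` and `blosc.compress` from python-blosc, the helper `time_compress` and the 16 MiB test payload are invented for illustration, and the script returns `None` instead of timing anything when python-blosc is not installed.

```python
import time

try:
    import blosc  # python-blosc; third-party, not in the standard library
except ImportError:
    blosc = None


def time_compress(payload, nthreads, repeats=3):
    """Best wall-clock time (seconds) to compress payload with nthreads.

    Returns None when python-blosc is not available.
    """
    if blosc is None:
        return None
    previous = blosc.set_nthreads(nthreads)  # returns the earlier setting
    try:
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            blosc.compress(payload, typesize=8)
            best = min(best, time.perf_counter() - start)
        return best
    finally:
        blosc.set_nthreads(previous)  # restore the original thread count


if __name__ == "__main__":
    # Hypothetical payload: 16 MiB of a repeating byte pattern.
    payload = bytes(range(256)) * (1 << 16)
    for n in (1, 2, 4, 8):
        print(f"nthreads={n}: {time_compress(payload, n)}")
```

If the timings do not change with the thread count here either, the limitation would sit in python-blosc or the C library rather than in bloscpack; if they do change, the bloscpack chunking and I/O loop becomes the more likely suspect.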