syzygy1 / tb

GNU General Public License v2.0

Reference: where it backfires with too many threads #26

Closed: noobpwnftw closed this issue 6 years ago

noobpwnftw commented 6 years ago

I did some performance measurements and verified that these run faster with fewer threads, without the -d option.

noobpwnftw commented 6 years ago

This also includes transform in reduction. These procedures are generally less time-consuming, usually taking 1-5 minutes once the thread count is limited.

syzygy1 commented 6 years ago

When called with a NUMA work list, transform should be OK on many threads. I will try to make the compression NUMA aware where easily possible and optionally allow lowering the number of threads for all other cases.
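A minimal sketch of what an optional per-phase thread cap could look like, not the actual tb code. The names `phase_threads` and `slice_start` are hypothetical, as is the convention that a cap of 0 means "no limit". The first helper clamps the thread count for one phase; the second splits a work range evenly over the threads that remain.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of threads to use for one phase.
 * max_threads <= 0 is taken to mean "no cap" (an assumption of this sketch). */
static int phase_threads(int total_threads, int max_threads)
{
    assert(total_threads > 0);
    if (max_threads > 0 && max_threads < total_threads)
        return max_threads;
    return total_threads;
}

/* Hypothetical helper: first index handled by thread t when `size` items
 * are split over n threads. Thread t works on [slice_start(t), slice_start(t + 1)).
 * The first (size % n) threads each get one extra item. */
static uint64_t slice_start(uint64_t size, int n, int t)
{
    uint64_t base = size / (uint64_t)n;   /* items every thread gets */
    uint64_t rem = size % (uint64_t)n;    /* leftover items to spread out */
    uint64_t extra = (uint64_t)t < rem ? (uint64_t)t : rem;
    return base * (uint64_t)t + extra;
}
```

With a cap of 64 on a 384-thread machine, `phase_threads(384, 64)` gives 64, and each of those threads would process the slice between `slice_start(size, 64, t)` and `slice_start(size, 64, t + 1)`.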

noobpwnftw commented 6 years ago

Maybe because they are too fast on the CPU side.

noobpwnftw commented 6 years ago

For example, fix_closs_worker runs for 6 seconds on 64 threads but 5 minutes on 384 threads, NUMA aware. I think with the new Xeon generation, the "M" models are worth it.
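To put the two timings above in proportion, here is a small sketch of the arithmetic. The function names are hypothetical, and "thread-seconds" is just wall time multiplied by thread count, used as a rough proxy for total CPU time spent.

```c
/* Wall-clock slowdown when going from the smaller to the larger thread count. */
static double wall_slowdown(double secs_few, double secs_many)
{
    return secs_many / secs_few;
}

/* Ratio of thread-seconds (threads * wall time) between the two runs. */
static double thread_seconds_ratio(int n_few, double secs_few,
                                   int n_many, double secs_many)
{
    return ((double)n_many * secs_many) / ((double)n_few * secs_few);
}
```

Taking 5 minutes as 300 seconds, `wall_slowdown(6.0, 300.0)` is 50, and `thread_seconds_ratio(64, 6.0, 384, 300.0)` is 300: six times the threads take fifty times as long and burn three hundred times the thread-seconds.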

syzygy1 commented 6 years ago

6 seconds versus 5 minutes? oops...

noobpwnftw commented 6 years ago

Yup, when there is congestion, the backfire is really annoying.