Closed: zeux closed this 2 years ago
This sounds great - thanks for submitting this. Will merge after validating (it's a small-ish looking change so that should be simple).
Looks like this was fixed as part of https://github.com/BinomialLLC/basis_universal/commit/deeb5acb563246f9b747229636205c5b19b99839#diff-a01c185dbfc142f2a0b5c67b814a556de24466173732064baa30ca51b3a35b97 so I'll close this.
Cool, thanks
Before this change, we filled the block->cluster mapping in parallel but also tried to aggregate the list of blocks for each cluster in the same job.
In a multithreaded run on a Ryzen 5900 system (24 threads), this caused significant contention over the mutex guarding those lists; moving the distribution into a serial pass that runs after all threads have finished brought total compression time down from 6.5 to 5.5 seconds.
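For readers who want to see the shape of the change, here is a minimal sketch of the two-phase pattern described above. It is not the actual basis_universal code: `find_cluster`, `build_cluster_lists`, and the chunking scheme are hypothetical names and choices made up for illustration, and plain `std::thread` stands in for whatever job pool the encoder uses.

```cpp
#include <algorithm>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for the real (expensive) per-block cluster search.
// Assumes num_clusters > 0.
static uint32_t find_cluster(uint32_t block_index, uint32_t num_clusters) {
    return (block_index * 2654435761u) % num_clusters;
}

// Returns, for each cluster, the list of block indices assigned to it.
std::vector<std::vector<uint32_t>> build_cluster_lists(uint32_t num_blocks,
                                                       uint32_t num_clusters,
                                                       uint32_t num_threads) {
    num_threads = std::max<uint32_t>(1, num_threads);
    std::vector<uint32_t> block_to_cluster(num_blocks);

    // Phase 1 (parallel): each thread writes only its own disjoint slice of
    // block_to_cluster, so no mutex is needed here.
    std::vector<std::thread> workers;
    const uint32_t chunk = (num_blocks + num_threads - 1) / num_threads;
    for (uint32_t t = 0; t < num_threads; ++t) {
        const uint32_t first = std::min(num_blocks, t * chunk);
        const uint32_t last = std::min(num_blocks, first + chunk);
        workers.emplace_back([&block_to_cluster, first, last, num_clusters] {
            for (uint32_t i = first; i < last; ++i)
                block_to_cluster[i] = find_cluster(i, num_clusters);
        });
    }
    for (auto& w : workers)
        w.join();

    // Phase 2 (serial): distribute blocks into per-cluster lists after all
    // threads have finished, instead of locking a shared mutex per block.
    std::vector<std::vector<uint32_t>> cluster_blocks(num_clusters);
    for (uint32_t i = 0; i < num_blocks; ++i)
        cluster_blocks[block_to_cluster[i]].push_back(i);

    return cluster_blocks;
}
```

The serial pass is a single linear sweep with one `push_back` per block, which is cheap next to the per-block search, so giving up parallelism there costs little. A side effect of this sketch is that each cluster's list comes out in ascending block order regardless of thread scheduling.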