Turn off most of the synchronization logic for non-parallel phases
Make our ThreadLocals eager and improve their performance (a lazy version is provided as well)
Change how the most common ThreadLocals are applied (basically, avoid creating 400k of them)
Create a specialized ThreadLocal for ints
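To illustrate the int-specialized thread-local idea, here is a minimal sketch in Java. It is not the code from this PR: the class name `IntThreadLocal` and its methods are hypothetical, and the approach (one primitive cell per thread, so reads and writes do not box an `Integer`) is only one possible way to do it. The initial value is evaluated once at construction, while each thread's cell is allocated on that thread's first access.

```java
/** Hypothetical sketch: a per-thread int that avoids boxing on get/set. */
final class IntThreadLocal {
    // Each thread owns a one-element array. The array reference is stored
    // in the ThreadLocal once; later reads and writes touch a primitive int.
    private final ThreadLocal<int[]> cell;

    IntThreadLocal(int initial) {
        // `initial` is a plain int, so it is computed before construction.
        this.cell = ThreadLocal.withInitial(() -> new int[] { initial });
    }

    int get() {
        return cell.get()[0];
    }

    void set(int value) {
        cell.get()[0] = value;
    }

    int incrementAndGet() {
        return ++cell.get()[0];
    }

    public static void main(String[] args) throws InterruptedException {
        IntThreadLocal counter = new IntThreadLocal(5);
        counter.set(7);
        counter.incrementAndGet();

        // A second thread starts from the initial value, not from the
        // value written by the main thread.
        Thread other = new Thread(() ->
            System.out.println("other thread sees " + counter.get()));
        other.start();
        other.join();

        System.out.println("main thread sees " + counter.get());
    }
}
```

The design choice here is to pay for one `ThreadLocal` lookup per access but no allocation, which matters when the value is updated very frequently.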
This PR improves performance from 28,374 (build 329) to 23,702 (build 328). The overhead is still quite big: ~17,79% over the non-parallel run (20,121).
Refchecks performance is slightly better in this PR (0,906 vs. 0,929). The non-parallel run takes 1,248, so this is a ~27,4% speedup. Running refchecks on this PR in non-parallel mode takes 1,367, so the relative speedup from parallelization is ~33,7%.
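The percentages quoted above can be reproduced with a small calculation. The sketch below assumes the measurements use decimal commas (so 23,702 is read as 23.702) and that lower is better. The ratios come out the same if the commas are thousands separators instead. The class and method names are hypothetical.

```java
/** Hypothetical helper that recomputes the percentages from the measurements. */
final class SpeedupMath {
    /** Extra cost of `measured` over `baseline`, in percent. */
    static double overheadPercent(double measured, double baseline) {
        return (measured / baseline - 1.0) * 100.0;
    }

    /** Time saved by `parallel` relative to `sequential`, in percent. */
    static double speedupPercent(double parallel, double sequential) {
        return (1.0 - parallel / sequential) * 100.0;
    }

    public static void main(String[] args) {
        // Whole run: parallel build vs. non-parallel run.
        System.out.printf("overhead: %.2f%%%n", overheadPercent(23.702, 20.121));
        // Refchecks: this PR vs. the non-parallel baseline.
        System.out.printf("speedup: %.1f%%%n", speedupPercent(0.906, 1.248));
        // Refchecks: this PR, parallel vs. non-parallel.
        System.out.printf("relative speedup: %.1f%%%n", speedupPercent(0.906, 1.367));
    }
}
```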