PR #23 fixed a potential deadlock in the allocator by making initialize_lists conservative (prevents the scheduler from stealing while waiting at the join point).
For some reason, this causes the performance of remove_duplicates to worsen by up to 30% in the benchmarks. Most other functions are unaffected or only a few percent different.
Reintroducing the deadlock is not desirable, but this performance regression should be investigated.
I miscalculated, this regression occurred in PR #22. The hash function for integers, used in many of the grouping and deduping algorithms became more expensive.
PR #23 fixed a potential deadlock in the allocator by makinginitialize_lists
conservative (prevents the scheduler from stealing while waiting at the join point).https://github.com/cmuparlay/parlaylib/blob/f53b6e30fbf0a7466017e2fadb28a2f9794369e9/include/parlay/internal/block_allocator.h#L88
For some reason, this causes the performance ofremove_duplicates
to worsen by up to 30% in the benchmarks. Most other functions are unaffected or only a few percent different.Reintroducing the deadlock is not desirable, but this performance regression should be investigated.I miscalculated, this regression occurred in PR #22. The hash function for integers, used in many of the grouping and deduping algorithms became more expensive.