Open Devjiu opened 1 year ago
Why is using tbb better than adjusting the worker count by some metric similar to the granularity parameter you introduced? When I did performance characterization with tbb, the cost of creating the tbb context and task group was higher than simply launching async threads.
As far as I understand, tbb here should check the thread pool and run slices of rows no smaller than the granularity. If we use a raw vector of threads and launch them directly, there is a danger of a system_error caused by exhausted system resources. We could catch the error and retry, but I decided that tbb would be the better choice here.
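For comparison, the "catch and repeat" alternative could look roughly like the sketch below. It is not the code from this PR: `sum_slice` and `checksum_async` are hypothetical names, the checksum is a plain sum, and instead of retrying the launch, the slice is simply computed on the calling thread when `std::async` throws.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <future>
#include <numeric>
#include <system_error>
#include <vector>

// Hypothetical stand-in for the per-slice checksum work: sums data[begin, end).
std::uint64_t sum_slice(const std::vector<std::uint64_t>& data,
                        std::size_t begin, std::size_t end) {
    return std::accumulate(data.begin() + begin, data.begin() + end,
                           std::uint64_t{0});
}

// Launches one async task per slice of `granularity` rows. If the runtime
// cannot start another thread, the slice is processed inline instead.
std::uint64_t checksum_async(const std::vector<std::uint64_t>& data,
                             std::size_t granularity) {
    assert(granularity > 0);
    std::vector<std::future<std::uint64_t>> pending;
    std::uint64_t total = 0;
    for (std::size_t begin = 0; begin < data.size(); begin += granularity) {
        const std::size_t end = std::min(begin + granularity, data.size());
        try {
            pending.push_back(std::async(std::launch::async, [&data, begin, end] {
                return sum_slice(data, begin, end);
            }));
        } catch (const std::system_error&) {
            // Thread creation failed: fall back to the calling thread.
            total += sum_slice(data, begin, end);
        }
    }
    // Collect the results of the slices that did get their own thread.
    for (auto& f : pending) {
        total += f.get();
    }
    return total;
}
```

This keeps the async approach alive under resource pressure, but the number of threads is still unbounded for large inputs with a small granularity, which is the part a thread pool addresses.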
This commit resolves a segfault in the H20 benchmark on 1e9 data. Currently a system_error occurs during checksum calculation. According to cppreference (https://en.cppreference.com/w/cpp/thread/async), std::system_error is thrown if there are not enough resources to create a new thread.
To avoid this situation, I rewrote this method to use tbb, which should automatically check for available threads in the thread pool.