[Async] Change std::async to tbb

Devjiu commented 1 year ago

This commit resolves segfault in H20 benchmark on 1e9 data. Currently a system_error occurs during checksum calculation. According to cppref: https://en.cppreference.com/w/cpp/thread/async. System error is thrown if there are not enough resources to create a new thread:

std::system_error with error condition std::errc::resource_unavailable_try_again, if
policy == std::launch::async and the implementation is unable to start a new thread.

If policy is std::launch::async | std::launch::deferred or has additional bits set,
it will fall back to deferred invocation or the implementation-defined policies in this case.

To avoid this situation, I rewrote this method to use tbb, which should automatically check for available threads in the thread pool.

alexbaden commented 1 year ago

Why is using tbb better than adjusting the worker count by some metric similar to the granularity parameter you introduced? When I did performance characterization with tbb, the cost of creating the tbb context and task group was higher than simply launching async threads.

Devjiu commented 1 year ago

Why is using tbb better than adjusting the worker count by some metric similar to the granularity parameter you introduced? When I did performance characterization with tbb, the cost of creating the tbb context and task group was higher than simply launching async threads.

As far as I understand, tbb here should check thread pool and run slices of rows not less than granularity. In case of using raw thread vector and launching it there is an danger of system_error due to system resources. We can try here catch and repeat, but I decided, that tbb will be better here.

intel / hdk

[Async] Change std::async to tbb #705