oneapi-src / oneTBB

oneAPI Threading Building Blocks (oneTBB)
https://oneapi-src.github.io/oneTBB/
Apache License 2.0

[Feature Request] Add aligned_blocked_range #48

Open Lastique opened 6 years ago

Lastique commented 6 years ago

It would be useful to have an aligned version of blocked_range:

template< typename Iterator, std::size_t Alignment >
class aligned_blocked_range;

The idea is that aligned_blocked_range should divide the range on boundaries that are a multiple of Alignment from the beginning of the range. It should still support a grain size, which must be no less than Alignment and is rounded up to the nearest multiple of Alignment. The range is splittable if its size is at least two grain sizes (i.e. a split must never produce a range smaller than Alignment).
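A minimal sketch of these rules, not an existing oneTBB class. It assumes an index-like Value type (integer or pointer) instead of the Iterator parameter above, and uses a local split_tag as a stand-in for tbb::split so it compiles without oneTBB headers. The member names mirror blocked_range, but the splitting policy (midpoint rounded down to a multiple of Alignment) is only one possible choice:

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for tbb::split so the sketch compiles without oneTBB headers.
struct split_tag {};

// Hypothetical sketch; Value is an index-like type (integer or pointer),
// not the Iterator parameter from the proposal.
template <typename Value, std::size_t Alignment>
class aligned_blocked_range {
    static_assert(Alignment > 0, "Alignment must be positive");

public:
    aligned_blocked_range(Value begin, Value end, std::size_t grainsize = Alignment)
        : begin_(begin), end_(end), grain_(round_up(grainsize)) {
        assert(!(end_ < begin_));
    }

    // Splitting constructor: r keeps the left part, *this takes the right part.
    aligned_blocked_range(aligned_blocked_range& r, split_tag)
        : begin_(r.begin_), end_(r.end_), grain_(r.grain_) {
        assert(r.is_divisible());
        // Midpoint, rounded down to a multiple of Alignment from r's beginning.
        std::size_t half = r.size() / 2;
        std::size_t offset = half - half % Alignment;
        begin_ = static_cast<Value>(r.begin_ + offset);
        r.end_ = begin_;
    }

    Value begin() const { return begin_; }
    Value end() const { return end_; }
    std::size_t size() const { return static_cast<std::size_t>(end_ - begin_); }
    std::size_t grainsize() const { return grain_; }
    bool empty() const { return !(begin_ < end_); }

    // Splittable only if both halves can be at least one grain long.
    bool is_divisible() const { return size() >= 2 * grain_; }

private:
    // Clamp the grain size to at least Alignment, then round it up
    // to the nearest multiple of Alignment.
    static std::size_t round_up(std::size_t g) {
        if (g < Alignment) g = Alignment;
        return (g + Alignment - 1) / Alignment * Alignment;
    }

    Value begin_;
    Value end_;
    std::size_t grain_;
};
```

With Alignment = 16, a requested grain size of 20 is rounded up to 32, and the range [0, 100) splits once into [0, 48) and [48, 100). Neither part is splittable again because both are shorter than two grains.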

The existing blocked_range doesn't provide that guarantee even if you specify an aligned grain size, because it splits the range in halves and thus loses alignment if the range size is not exactly an even multiple of the grain size. Also, the current blocked_range split can produce a range that is smaller than the grain size.
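To illustrate, here is a small standalone simulation of splitting in halves. It does not call oneTBB. It assumes the range is split recursively at begin + (end - begin) / 2 for as long as its size exceeds the grain size, which is roughly what happens under a simple partitioner. The helper names halve and halving_chunks are made up for this example:

```cpp
#include <cstddef>
#include <utility>
#include <vector>

using chunk = std::pair<std::size_t, std::size_t>;  // [first, second)

// Recursively halves [begin, end) while its size exceeds the grain size
// and appends the resulting leaf chunks to out, in order.
inline void halve(std::size_t begin, std::size_t end, std::size_t grain,
                  std::vector<chunk>& out) {
    if (end - begin > grain) {
        std::size_t middle = begin + (end - begin) / 2;
        halve(begin, middle, grain, out);
        halve(middle, end, grain, out);
    } else {
        out.emplace_back(begin, end);
    }
}

// Convenience wrapper returning the leaf chunks of [begin, end).
inline std::vector<chunk> halving_chunks(std::size_t begin, std::size_t end,
                                         std::size_t grain) {
    std::vector<chunk> out;
    halve(begin, end, grain, out);
    return out;
}
```

For halving_chunks(0, 100, 16) the leaf chunks start at 0, 12, 25, 37, 50, 62, 75 and 87. No interior boundary is a multiple of 16, and every chunk has 12 or 13 elements, which is below the grain size of 16.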

The aligned_blocked_range is useful when the input range is already aligned, and threaded processing should preferably be aligned as well. For example, in image processing, the image rows might be cache line aligned, and each chunk of a row handed to a thread should also be aligned to a cache line to avoid false sharing.

anton-malakhov commented 6 years ago

+1 This is a very desirable extension that would simplify writing high-performance SIMD code.

arunparkugan commented 1 month ago

@anton-malakhov is this issue still relevant?

anton-malakhov commented 1 month ago

yes