ericniebler / range-v3

Range library for C++14/17/20, basis for C++20's std::ranges

Performance issue: concatenating two contiguous ranges #1697

Open pfeatherstone opened 2 years ago

pfeatherstone commented 2 years ago

I'm experimenting with range-v3 to figure out whether it is truly a zero-cost abstraction and can make my life easier. In the following example, I compare range-v3 with vanilla STL when computing dot products (in forward and reverse order) over a circular (ring) buffer. The buffer is never rotated; instead, the write-position offset is incremented modulo the buffer size. This means the logical range consists of two contiguous sub-ranges. The vanilla STL implementation computes the dot product over the cyclic buffer as two sub dot products. The range-v3 version concatenates the two sub-ranges and computes the dot product over the resulting view.
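For readers without the benchmark at hand, here is a rough sketch of what the manual STL side of such a comparison could look like. It is not the code from the benchmark: the function names `ring_dot` and `ring_dot_modulo`, the use of `std::vector<float>`, and the convention that `offset` marks the logically first element are all assumptions made for illustration. Only the forward-order case is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not from the original benchmark).
// Logical order of the ring buffer: buf[offset..n) followed by buf[0..offset).
// The dot product is split into two std::inner_product calls, one per
// contiguous piece, so each call iterates over plain contiguous memory.
float ring_dot(const std::vector<float>& buf, std::size_t offset,
               const std::vector<float>& coeffs)
{
    assert(buf.size() == coeffs.size());
    assert(offset <= buf.size());

    const auto off  = static_cast<std::ptrdiff_t>(offset);
    const auto tail = static_cast<std::ptrdiff_t>(buf.size() - offset);

    // First piece: buf[offset..n) against coeffs[0..tail).
    const float acc = std::inner_product(buf.begin() + off, buf.end(),
                                         coeffs.begin(), 0.0f);
    // Second piece: buf[0..offset) against coeffs[tail..n).
    return std::inner_product(buf.begin(), buf.begin() + off,
                              coeffs.begin() + tail, acc);
}

// Reference version with per-element modulo indexing, useful only for
// checking that the two-piece version computes the same value.
float ring_dot_modulo(const std::vector<float>& buf, std::size_t offset,
                      const std::vector<float>& coeffs)
{
    assert(buf.size() == coeffs.size());

    const std::size_t n = buf.size();
    float acc = 0.0f;
    for (std::size_t i = 0; i < n; ++i)
        acc += buf[(offset + i) % n] * coeffs[i];
    return acc;
}
```

The range-v3 counterpart would build the same two sub-ranges and join them with `ranges::views::concat` before computing the dot product over the joined view; it is left out here because it needs the library to compile.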

The Godbolt link shows that the STL version is at least 10x faster. Running locally on my Intel machine, it can be upwards of 100x faster. So what is going on? Am I misusing range-v3? Is range-v3 unable to figure out that the underlying range consists of two contiguous ranges, and therefore unable to use the correct iterator type on each subrange? Something else? Any guidance or comments would be appreciated.

pfeatherstone commented 2 years ago

https://github.com/ericniebler/range-v3/issues/1081 seems related

pfeatherstone commented 1 year ago

@ericniebler Do you have an opinion on this? Is there a way to get ranges to compile to optimal code, i.e. code with performance similar to a more manual approach using the STL?