Open aldanor opened 8 years ago
Interesting, thanks. I don't have time to look right now, but I wonder if you made object
move-only where the compiler errors out. That will tell you where the copies are being made.
That wouldn't compile because yield_fn
requires SemiRegular
:
range-v3/include/range/v3/view/for_each.hpp:110:17: error: static_assert failed
"The object passed to yield must be a model of the
SemiRegular concept; that is, it needs to be default
constructible, copy and move constructible, and destructible."
CONCEPT_ASSERT_MSG(SemiRegular<Val>(),
OK, right, because view::single
holds an object
by value, and views must themselves be SemiRegular. That doesn't explain where the copies are coming from during iteration. I'll look into it.
That doesn't explain where the copies are coming from during iteration.
single
caches in its iterators.
I have a half-finished change that addresses this and the SemiRegular
issue, FWIW.
IIRC, I did that because it vastly improved the generated code for range compositions. We really need a suite of benchmarks.
IIRC, I did that because it vastly improved the generated code for range compositions.
Yes, that is my belief as well. Avoiding pointers has incredible effects on code generation for tight loops - it makes alias analysis easy when there are no pointers.
The repeat view in cmcstl2 uses a heuristic: if the objects are trivially copyable and "small" (<= 4 pointers, chosen completely arbitrarily) I copy them everywhere. This is one of those ideas I intend to investigate in detail "some day."
This is more of a question rather than an issue (so please feel free to close if it's irrelevant) -- I've seen a few recent open issues regarding yield/for_each and decided to do a bit of testing to check the copy/move ctor call counts in the final optimized binary.
Compiled with gcc 6, c++14, -O3, this yields:
The move ctors I can live with (although 6 moves per element is quite a a lot), but 1-3 copy ctors per iteration is completely prohibitive for anything but the simplest pod types like integers. E.g., in the second example we yield an rvalue, it gets moved 6 times but apparently also copied twice per iteration?
Is there any way with the current implementation to force the copy-ctor count to zero, at least in trivial examples like this (other than ugly hacks like yielding pointers), or is this a design limitation?
Thanks 😉