Closed e0e1e613-dccb-47a5-85db-3dc989429265 closed 2 years ago
Hi!
Is there any chance on landing some form of this? It gives me a pretty nice speedup.
BTW, I think the mechanism that we use here can help us to address the fact that back_inserter is 10 times slower.
http://quick-bench.com/8JSAAEXdeRX7M2hp1iRGKlq9_XI
Of course, you might say that you should not construct with back inserter, but in some places, you cannot avoid that (like in algorithms).
I just realized, I haven't attached my test files. Sorry about that.
There are a lot of changes, because it's quite hard to implement this staff with traditional SFINAE, I had to redo everything relying on dead code elimination.
Great! Thx for doing this!
Wow, that's a lot of code churn for a small change. I'll put some concrete comments on the phab link.
Ping?
Ping, just in case. I know you were busy with CppCon and stuff.
Best, Denis.
I went ahead and created a patch: https://reviews.llvm.org/D38653 Please tell me what you think, when you have the time.
This turned out to be harder than I initially suspected.
I've created a possible solution in the compiler explorer: https://godbolt.org/g/xJH5dA (I wanted to confirm that everything optimizes away). Before I touch libc++, can you please take a look to see if my idea is generally ok?
Another thing that bugs me - is testing. Is there a good way to see if loop or memmove is called without hacking the functions themselves?
In the end, how much do you mind me fixing this?
I don't mind you exploring this at all. In fact, I appreciate it! Once you're ready to share the results, then we can decide if it's worth incorporating it into libc++.
BTW, your godbolt link assumes that the source and dest iterator types are the same. In general, that's not true for std::copy.
I'm already halfway done fixing this, however there is a compiler bug that makes unwrapping more complicated than it should be (it's not a very clean code even with it fixed).
Which begs the question - is this worth doing? Do you have a situation where this is a performance bottleneck?
Obviously this is not even in a top 100 issues worth fixing.
However:
1) I came across it while doing some microbenchmarking for relatively unrelated things and had to create my own copy wrappers, which is unpleasant.
2) The biggest problem here is not copy itself but rather all the algorithms that use copy internally - merge/rotate etc. They can be very reasonable used with reverse iterators.
3) This is a noticeable win: http://quick-bench.com/WgrErKntk_VJgqnVSpFsFy-wXFw
4) I looked in chromium and sometimes (very occasionally) this seem to happen in production: https://cs.chromium.org/chromium/src/third_party/gestures/gestures/include/vector.h?l=187&rcl=5a656849c7d2b0d0ddbe0ac6d300c1e2fada0bb4 https://cs.chromium.org/chromium/src/chrome/browser/ui/toolbar/toolbar_actions_bar.cc?l=308&rcl=7242c905623f9f5243e757e3fbaa35298e9c8c30
In the end, how much do you mind me fixing this?
I'm already halfway done fixing this, however there is a compiler bug that makes unwrapping more complicated than it should be (it's not a very clean code even with it fixed).
Which begs the question - is this worth doing? Do you have a situation where this is a performance bottleneck?
What is the policy guiding this - is it reasonable to wait for the compiler bug to be fixed or libc++ trunk has to compile on older versions than current trunk?
libc++ has to be able to be built/used with older versions of clang.
I'm already halfway done fixing this, however there is a compiler bug that makes unwrapping more complicated than it should be (it's not a very clean code even with it fixed).
What is the policy guiding this - is it reasonable to wait for the compiler bug to be fixed or libc++ trunk has to compile on older versions than current trunk?
assigned to @mclow
https://reviews.llvm.org/D124328 fixes this.
Extended Description
std::copy could unroll 3 reverse_iterators into copy_backward and than do a memmove. Example: https://godbolt.org/g/G6NWRG