llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
29.4k stars 12.15k forks source link

std::copy does not optimise for reverse_iterator. #33687

Closed e0e1e613-dccb-47a5-85db-3dc989429265 closed 2 years ago

e0e1e613-dccb-47a5-85db-3dc989429265 commented 7 years ago
Bugzilla Link 34339
Version unspecified
OS All
CC @mclow

Extended Description

std::copy could unroll 3 reverse_iterators into copy_backward and than do a memmove. Example: https://godbolt.org/g/G6NWRG

e0e1e613-dccb-47a5-85db-3dc989429265 commented 7 years ago

Hi!

Is there any chance on landing some form of this? It gives me a pretty nice speedup.

BTW, I think the mechanism that we use here can help us to address the fact that back_inserter is 10 times slower.

http://quick-bench.com/8JSAAEXdeRX7M2hp1iRGKlq9_XI

Of course, you might say that you should not construct with back inserter, but in some places, you cannot avoid that (like in algorithms).

e0e1e613-dccb-47a5-85db-3dc989429265 commented 7 years ago

I just realized, I haven't attached my test files. Sorry about that.

There are a lot of changes, because it's quite hard to implement this staff with traditional SFINAE, I had to redo everything relying on dead code elimination.

e0e1e613-dccb-47a5-85db-3dc989429265 commented 7 years ago

Great! Thx for doing this!

mclow commented 7 years ago

Wow, that's a lot of code churn for a small change. I'll put some concrete comments on the phab link.

e0e1e613-dccb-47a5-85db-3dc989429265 commented 7 years ago

Ping?

e0e1e613-dccb-47a5-85db-3dc989429265 commented 7 years ago

Ping, just in case. I know you were busy with CppCon and stuff.

Best, Denis.

e0e1e613-dccb-47a5-85db-3dc989429265 commented 7 years ago

I went ahead and created a patch: https://reviews.llvm.org/D38653 Please tell me what you think, when you have the time.

e0e1e613-dccb-47a5-85db-3dc989429265 commented 7 years ago

This turned out to be harder than I initially suspected.

I've created a possible solution in the compiler explorer: https://godbolt.org/g/xJH5dA (I wanted to confirm that everything optimizes away). Before I touch libc++, can you please take a look to see if my idea is generally ok?

Another thing that bugs me - is testing. Is there a good way to see if loop or memmove is called without hacking the functions themselves?

mclow commented 7 years ago

In the end, how much do you mind me fixing this?

I don't mind you exploring this at all. In fact, I appreciate it! Once you're ready to share the results, then we can decide if it's worth incorporating it into libc++.

BTW, your godbolt link assumes that the source and dest iterator types are the same. In general, that's not true for std::copy.

e0e1e613-dccb-47a5-85db-3dc989429265 commented 7 years ago

I'm already halfway done fixing this, however there is a compiler bug that makes unwrapping more complicated than it should be (it's not a very clean code even with it fixed).

Which begs the question - is this worth doing? Do you have a situation where this is a performance bottleneck?

Obviously this is not even in a top 100 issues worth fixing.

However:

1) I came across it while doing some microbenchmarking for relatively unrelated things and had to create my own copy wrappers, which is unpleasant.

2) The biggest problem here is not copy itself but rather all the algorithms that use copy internally - merge/rotate etc. They can be very reasonable used with reverse iterators.

3) This is a noticeable win: http://quick-bench.com/WgrErKntk_VJgqnVSpFsFy-wXFw

4) I looked in chromium and sometimes (very occasionally) this seem to happen in production: https://cs.chromium.org/chromium/src/third_party/gestures/gestures/include/vector.h?l=187&rcl=5a656849c7d2b0d0ddbe0ac6d300c1e2fada0bb4 https://cs.chromium.org/chromium/src/chrome/browser/ui/toolbar/toolbar_actions_bar.cc?l=308&rcl=7242c905623f9f5243e757e3fbaa35298e9c8c30

In the end, how much do you mind me fixing this?

mclow commented 7 years ago

I'm already halfway done fixing this, however there is a compiler bug that makes unwrapping more complicated than it should be (it's not a very clean code even with it fixed).

Which begs the question - is this worth doing? Do you have a situation where this is a performance bottleneck?

What is the policy guiding this - is it reasonable to wait for the compiler bug to be fixed or libc++ trunk has to compile on older versions than current trunk?

libc++ has to be able to be built/used with older versions of clang.

e0e1e613-dccb-47a5-85db-3dc989429265 commented 7 years ago

I'm already halfway done fixing this, however there is a compiler bug that makes unwrapping more complicated than it should be (it's not a very clean code even with it fixed).

What is the policy guiding this - is it reasonable to wait for the compiler bug to be fixed or libc++ trunk has to compile on older versions than current trunk?

e0e1e613-dccb-47a5-85db-3dc989429265 commented 7 years ago

assigned to @mclow

philnik777 commented 2 years ago

https://reviews.llvm.org/D124328 fixes this.