llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org

std::(ranges::)fill on spans does not optimize with -Oz #78671

Open davidben opened 9 months ago

davidben commented 9 months ago

std::span doesn't provide span-based methods for memcpy, memset, etc. I gather this is because we already have more general iterator- or range-based versions of these operations, so there's no need to duplicate the API surface when std::fill, etc., can just be specialized on contiguous iterators.

This works out nicely with -O2, where Clang and libc++ are able to dissolve the abstractions effectively. However, with -Oz, this doesn't happen and we get larger code than with -O2! https://godbolt.org/z/r1snebavx

I assume this is because -Oz's inlining heuristics stop too early and don't break down the abstractions that they're meant to break down. I'm not sure whether this should be fixed in libc++ or in -Oz but, one way or another, this combination isn't great. Moving codebases to something like spans and ranges is a nice safety win, particularly combined with libc++ hardening mode, but the standard way to express these operations doesn't seem to work well under -Oz.

CC @danakj

danakj commented 8 months ago

We looked at the disassembly for std::ranges::copy today and saw the same thing there. While a call to memcpy(x, y, sizeof(int)) optimizes nicely, the equivalent call to std::ranges::copy() (where the source has a compile-time-known size) is left as a function call and not inlined.

Edit: This happens with -Oz if your iterators are complex enough (such as the checked contiguous iterators in //base in Chromium). We observe that std::copy() with iterators also optimizes poorly, but with pointers it does well in -Oz.