Closed ashley-hawkins closed 5 months ago
@llvm/issue-subscribers-clang-modules
Author: None (ashley-hawkins)
I suspect this is not a compiler's issue but a standard library issue.
See my workaround and the comments in https://github.com/alibaba/async_simple/blob/d6f201192bb332fc9dabcfdf3fa29e85c83c7f45/third_party_module/stdmodules/string.cppm#L28-L42.
And it works for https://godbolt.org/z/7Y3r7sM66.
Also the problem disappear too if we're using libc++: https://godbolt.org/z/oPrY3Y8Es
So I think this is a libstdc++ issue. I'll close this as not planned. Feel free to comment or reopen this.
I did a git bisect, and observed that the issue only appears in 1ecbab56dcbb78268c8d19af34a50591f90b12a0 and later.
Is this change just revealing a problem with libstdc++? If it's the case, I will see if it needs to be reported to libstdc++'s bug tracker.
Historically, clang has worked around issues with libstdc++ so that clang users are not forced to use libc++ (that might not be possible for ABI reasons).
That being said, I'm not sure exactly what the issue is here. I see extern template allocator<char>;
in allocator.h
and I see template allocator<char>
in allocator-inst.cc
. I don't know how it could not work not but it used to work before -- I would expect both or neither. This makes me suspect this is a clang issue?
If I'm understanding the linked commit (https://github.com/llvm/llvm-project/commit/1ecbab56dcbb78268c8d19af34a50591f90b12a0) correctly, then I'm also surprised by that behavior. It seems like that commit turns something that literally says "always inline this" into "unless this only under these conditions". I would expect always_inline to be stronger than inline, not stronger within a module but weaker between modules. There is a reference to ABI requirements in that commit, but I don't see the issue here -- someone who marks a function always_inline isn't going for ABI stability as they change that function. Does this mean that people who want the 'old' behavior would have to say both inline and always_inline?
That being said, I'm not sure exactly what the issue is here. I see extern template allocator
; in allocator.h and I see template allocator in allocator-inst.cc. I don't know how it could not work not but it used to work before -- I would expect both or neither. This makes me suspect this is a clang issue?
This should not be relevant to clang. Since I can't see the symbols in libstdc++.so, which should be compiled from GCC. So I believe this may be a GCC defect. But,
Historically, clang has worked around issues with libstdc++ so that clang users are not forced to use libc++ (that might not be possible for ABI reasons).
Yeah, I've received multiple complaints about this too. So probably I need to do some workaround in clang even if I think it is a GCC issue.
If I'm understanding the linked commit (https://github.com/llvm/llvm-project/commit/1ecbab56dcbb78268c8d19af34a50591f90b12a0) correctly, then I'm also surprised by that behavior. It seems like that commit turns something that literally says "always inline this" into "unless this only under these conditions". I would expect always_inline to be stronger than inline, not stronger within a module but weaker between modules. There is a reference to ABI requirements in that commit, but I don't see the issue here -- someone who marks a function always_inline isn't going for ABI stability as they change that function. Does this mean that people who want the 'old' behavior would have to say both inline and always_inline?
My thought was, the inline
key words in C++ standard means linkage, and the always_inline
attribute is an extension to pass the information to the optimizer. They looks similar but not the same thing from my side. But probably I have to add back the original behavior due to the reason above.
That being said, I'm not sure exactly what the issue is here. I see extern template allocator; in allocator.h and I see template allocator in allocator-inst.cc. I don't know how it could not work not but it used to work before -- I would expect both or neither. This makes me suspect this is a clang issue?
This should not be relevant to clang. Since I can't see the symbols in libstdc++.so, which should be compiled from GCC. So I believe this may be a GCC defect. But,
Please could you file a Bugzilla for GCC/libstc++ (https://gcc.gnu.org/bugzilla/) in that case; there is an intent that the compilers and libraries can interoperate.
Since I can't see the symbols in libstdc++.so, which should be compiled from GCC.
The functions that generate those symbols only exist for __cplusplus >= 202002L
because they only need to exist for C++20 constexpr (de)allocations, which must be (de)allocated by std::allocator<T>::allocate
and not some_base_class::allocate
. So for C++20 there are always_inline
forwarding functions that just call the base class allocate
and deallocate
. They should not need to be exported from the .so because nobody ever depends on them: they're always inlined.
It seems that modules changes this picture, and extern symbols are now needed for these always_inline functions.
Compiling the explicit instantiation definitions as C++20 instead of C++98 causes the explicit instantiation definitions to include those functions, so they're in the .so
That change might be OK for GCC 14 still, but backporting to GCC 13 would be complicated.
David wrote:
Does this mean that people who want the 'old' behavior would have to say both inline and always_inline?
N.B. GCC requires that anyway; always_inline
is ignored if it's used on a non-inline function. That shouldn't be relevant here though, because std::allocator<char>::allocate
is defined in the class body so is implicitly inline.
So if the linked LLVM commit is supposed to "not export non-inline function bodies even if marked always_inline" then why does it change anything here? The affected functions are inline. Is there a bug where functions that are implicitly inline due to being defined in the body are somehow being treated as not inline because they don't have a redundant inline
keyword? Or is the explicit instantiation declaration for std::allocator<char>
causing LLVM to treat all its members as not-inline-functions even when they are actually inline functions?
Edit: Oh, and those functions are also marked constexpr
so that makes them inline too. So they're doubly inline!
The commit was to fix #80949 which has a non-inline function marked always_inline
, which is ... just wrong? Don't do that?
But I don't see why that's caused this regression with libstdc++'s std::allocator
. As I said in the comment immediately above this one, those functions are inline functions. So if LLVM really wants to support always_inline
on non-inline functions, fine ... you do you. But why is it changing the behaviour for always_inline
on inline functions?
Here is a minimal reproducer without libstdc++ involved: https://godbolt.org/z/7eMbxoxY3
Or is the explicit instantiation declaration for std::allocator
causing LLVM to treat all its members as not-inline-functions even when they are actually inline functions?
Yeah, this is the case. The linkage std::allocator<int>::allocate
is treated as inline linkage. But the explicit instantiation std::allocator<char>::allocate
makes LLVM think we can find the symbol elsewhere. So the linkage of std::allocator<char>::allocate
is not inline at least in the CodeGen phase of LLVM.
In another point of view, without modules, for extern <explicit-template-instantiation>
and actually instantiate it in other .cc
files, we don't inline the explicit instantiations, right? So I feel good with the behavior.
In another point of view, without modules, for
extern <explicit-template-instantiation>
and actually instantiate it in other.cc
files, we don't inline the explicit instantiations, right?
I'm not sure what you mean by "inline the explicit instantiations". An inline function which has an explicit instantiation declaration is still eligible to be inlined. Especially if it's marked always_inline
.
The compiler is not forced to use an explicit instantiation definition in some other TU, it's allowed to do an implicit instantiation and inline it. And I think for an always_inline
function that's exactly what should happen. It's supposed to be always inlined!
And when modules aren't involved, Clang seems to agree with GCC there. The std::allocator<char>::deallocate
function has never been present in libstdc++.so
and yet you can still use std::string
in C++20, because the always_inline allocate
and deallocate
functions get inlined, always.
If they were not being inlined, and an extern symbol was being expected, you would never be able to link any code using libstdc++'s std::string
in C++20.
In another point of view, without modules, for
extern <explicit-template-instantiation>
and actually instantiate it in other.cc
files, we don't inline the explicit instantiations, right?I'm not sure what you mean by "inline the explicit instantiations". An inline function which has an explicit instantiation declaration is still eligible to be inlined. Especially if it's marked
always_inline
.The compiler is not forced to use an explicit instantiation definition in some other TU, it's allowed to do an implicit instantiation and inline it. And I think for an
always_inline
function that's exactly what should happen. It's supposed to be always inlined!
This is what I mean: https://godbolt.org/z/TK5Ye9z1P, both GCC and Clang don't generate the body for A<int>::func
in the current TU due to it is using extern <explicit-template-instantiation>
.
Then let's forget always_inline
for a while, I want to say, if we're using extern <explicit-template-instantiation>
, we need to ensure the symbols to be exist in some other TU. Then in the example, the libstdc++ uses extern template class allocator<char>
but didn't provide the symbol for allocator
and dealloactor
. So this the reason why I think it is a defect.
Then for always_inline
, it is just a hint to the optimizer but it shouldn't change the semantics of the program. But in the current case, it changes the invalid code into valid code surprisingly. It just works but it is not strictly well defined.
always_inline
is not just a hint though.
We use it in libstdc++ specifically to deal with the case where a symbol is not part of the explicit instantiation definition in the .so e.g. for symbols that are new C++26 features, which are not exported from the .so yet.
The GCC docs for always_inline
say "Failure to inline such a function is diagnosed as an error." which means that the function is always inlined. That's not a hint, it's a guarantee. No symbol is needed for a function that's always inlined.
Now if modules change the equation, we need to consider what changes. But I don't think we can just say "forget always_inline for a while" when that attribute is being used specifically to avoid the need for an extern symbol. The whole issue hinges on the use of that attribute.
If Clang wants to make this change and thereby require ABI changes to libstdc++.so then I think either the change should be delayed until libstdc++ can be updated to deal with it, or Clang should be very clear that you can't use Clang 19 with libstdc++ and modules, because clang chose to change something that was previously guaranteed and relied on by libstdc++.
Patch to provide those symbols in libstdc++: https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649312.html
(IMHO) we should make sure that we understand the behaviour we want in this case, before amending ABIs.. it seems that there could have been a misunderstanding about what is expected of always_inline
c.f inline
.
e.g If we
always_inline
functions into the BMI, always_inline
ing fails (either from a module BMI import - or from a textual case)does that work?
I don't know, I don't know what effect emitting it in the BMI would have because I don't understand modules. But I wonder why the always_inline function isn't just inlined in the first place. Why is there an undefined reference to it?
Here's the minimal reproducer again: https://godbolt.org/z/7eMbxoxY3
This gets an undefined reference to the always_inline function, because it's called from a function that's exported from the module. Why isn't it inlined? Why doesn't repro::~X()
in module.cc just inline the always_inline function, so there's no need for a symbol? Isn't that what always_inline is supposed to mean?
ChuanqiXu9 said:
This is what I mean: https://godbolt.org/z/TK5Ye9z1P, both GCC and Clang don't generate the body for
A<int>::func
in the current TU due to it is usingextern <explicit-template-instantiation>
.
But that's not the same. Yes, in that case of course there's an undefined ref to A<int>::func
and of course it's not defined in the current TU. Because that's what the explicit instantiation declaration says to do.
But if the same example is compiled with optimization enabled, even at -O1, then there's no undefined reference. A<int>::func
is inlined and the function just returns 43. And if you use always_inline
so that it's always inlined, that happens even without -O1. Because that's what the always_inline attribute says to do.
Why does an always_inline function used by a function that is exported from a module not get inlined? I'm not expecting the always_inline function to be exported, but I do expect it to be inlined. Because that's what the attribute says to do. In fact, the code should not even compile if it's not inlined. Why is that code giving a linker error? Either the function should be inlined (and there's no undefined reference) or it should be a compiler error. That's what the attribute means, by definition. Inline it or die trying.
The bug that the behaviour change was intended to fix (#80949) doesn't seem to be the same as this problem, that has an always_inline function (which is not an inline function) that is also supposed to be exported. I have no idea what that is meant to do. But that's not the problem here. The deallocate
function in https://godbolt.org/z/7eMbxoxY3 doesn't need to be exported. It just needs to be inlined into its caller, and the caller is exported.
It seems to me that a more reasonable response to #80949 would have been "you can't export always_inline functions, that doesn't work". Instead the documented always_inline guarantee (which working code in the real world relies on) has been broken to support a novel use of the attribute with modules, something that has never been officially described as supported in any documentation.
It seems that the underlying issue is not especially modules-related; instead it is that clang is treating the always_inline
as a hint, but the semantic is meant to be "do it or emit a fatal diagnostic"
So first we need to decide what to do about this in general - we can then decide on whether always_inline bodies need to be in the BMI (probably only if they are exported, I suppose).
There are some separated topics in the thread now. Let's try to separate them clear.
I think that it is not strictly correct to make the behavior of the program (or the ABI) relies on always_inline
. So always_inline
changes the ABI of the library. This looks not good. I see no mention of always_inline
in Itanium ABI (https://itanium-cxx-abi.github.io/cxx-abi/abi.html). Also I see the always_inline
as a requirement to the optimizer. Then the behavior of our program relies on the optimizer... this looks not good. And this makes the attribute not ignorable. I know it works. But I still think the behavior is not well defined. Or maybe we're discussing a not well defined problem from the very beginning due to we're talking about vendor's extentsions : )
On the practical side, given I've received many complaints, and even if libstdc++ fixes the issue in their side, the end users still need to update the libstdc++ to get the correct behavior, which is not very user friendly. So I've decided to use a new mechanism to make sure users won't meet this problem again.
Previously I thought I need to do something fancy about static local variables. But the ideas of forbidding non-inline always_inline function in module interfaces looks promising. I don't understand why export matter here. It should be good to export an always_inline inline function. (The first inline is a requirement to the optimizer and the second 'inline' means its linkage. So it should be well defined even if it looks redundant.)
I think that it is not strictly correct to make the behavior of the program (or the ABI) relies on always_inline.
I disagree.
So always_inline changes the ABI of the library. This looks not good.
Adding always_inline
to an existing API might change its ABI, but that's not relevant here. The functions never existed before, and were added with always_inline
on day one. So the attribute doesn't change the API, it defines it. It's an intentional part of the library ABI.
Please leave it to library authors to worry about how they define their ABI instead of deciding that clang knows better.
Previously I thought I need to do something fancy about static local variables.
You've broken a valid use case to support a crazy use case with local statics in functions with multiple definitions.
I don't understand why export matter here.
Because you should be able to inline the deallocate
function into ~X()
and it makes absolutely no difference to the exported functions in the BMI. The deallocate
function isn't being exported anyway.
It should be good to export an always_inline inline function.
Why? deallocate
isn't supposed to be exported from the module.
I think that it is not strictly correct to make the behavior of the program (or the ABI) relies on always_inline.
I disagree.
So always_inline changes the ABI of the library. This looks not good.
Adding
always_inline
to an existing API might change its ABI, but that's not relevant here. The functions never existed before, and were added withalways_inline
on day one. So the attribute doesn't change the API, it defines it. It's an intentional part of the library ABI.Please leave it to library authors to worry about how they define their ABI instead of deciding that clang knows better.
What I mean is, the ABI now depends on the attribute always_inline
. Now the optimization "inlining" is not a pure optimization but a key factor of the semantics. This is somehow a mess to me.
Previously I thought I need to do something fancy about static local variables.
You've broken a valid use case to support a crazy use case with local statics in functions with multiple definitions.
I don't understand why export matter here.
Because you should be able to inline the
deallocate
function into~X()
and it makes absolutely no difference to the exported functions in the BMI. Thedeallocate
function isn't being exported anyway.It should be good to export an always_inline inline function.
Why?
deallocate
isn't supposed to be exported from the module.
Maybe we're not on the same page. I'm not saying deallocate
here. I'm saying how should we handle the problem generally. What I mean is:
export module a;
export inline __attribute__((always_inline)) int f() {...}
should be valid while
export module a;
export __attribute__((always_inline)) int f() {...}
may not be valid.
We could separate this into two issues:
__attribute__((always_inline))
, which is a vendor extension - but it does not implement the required semantics; choices might be:
The vendor extension __attribute__((always_inline))
is a long-standing one, it is not new - it is not going to go away.
About the clang's behavior about always_inline
always_inline
then all the use of the function get inlined. So there is no require to the symbol in the user's object.For what libstdc++ need to do:
For what clang will do:
Sorry if the title is not very clear. Basically if in my main.cc I'm importing a module, and that module exports a function which returns std::string, and I call that function from main.cc, it will fail to compile on latest clang with a linker error that
I tried to make a minimal reproducer but I'm not sure whether or not this is actually minimal, because I don't know if it could be reproduced without std::allocator or if it's specifically a problem with std::allocator. In module.cc you can change the return type of repro::repro to
std::string
to see that it also happens withstd::string
. It also happens withstd::vector<char>
, andstd::deque<char>
, and I guess any other standard container would have the same problem with an element type of char. I wasn't able to reproduce without using modules, which is why I labelled it as a modules problem.Compiler Explorer reproducer for 19.0.0git: https://godbolt.org/z/5GbczMEa7 Compiler Explorer example of inability to reproduce for 18.1.0: https://godbolt.org/z/EEajTa1nj (same code as in above I just changed the compiler to x86-64 Clang 18.1.0)
As of writing this:
Problematic version of Clang tested on Compiler Explorer:
Problematic version of Clang that I originally found the problem on, from apt.llvm.org:
Working version of Clang tested on Compiler Explorer: