itanium-cxx-abi / cxx-abi

C++ ABI Summary
[C++20] [Modules] Do we need the concept of `key function` for class defined in module purview? #170

Open ChuanqiXu9 opened 1 year ago

ChuanqiXu9 commented 1 year ago

See https://github.com/llvm/llvm-project/issues/70585 for the motivation issue.

Simply put, clang's current behavior violates Itanium ABI 5.2.3:

The virtual table for a class is emitted in the same object containing the definition of its key function, i.e. the first non-pure virtual function that is not inline at the point of class definition.

But some people have asked whether the concept of a key function is still necessary now that modules have been introduced, so I am bringing the issue here.

My thinking about the key function is this: in the headers model, a class may be defined in many TUs, so we need to pick one TU in which to generate the virtual table. With modules, if the class is defined in a module purview, the TU that should generate the virtual table falls out naturally.
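As a rough illustration of the rule quoted above, here is a minimal sketch in the headers model. The classes `Shape` and `Square` are hypothetical and do not come from the ABI document. The key function is the first virtual function that is neither pure nor inline at the point of class definition, and only the TU that defines it emits the vtable:

```cpp
#include <type_traits>

// Hypothetical header-style class, shown in a single file for brevity.
struct Shape {
    virtual ~Shape() = default;           // defaulted in class: implicitly inline, not the key function
    virtual int area() const = 0;         // pure: not the key function
    virtual int sides() const;            // first non-pure, non-inline virtual: the key function
    virtual int id() const { return 0; }  // defined in class: implicitly inline
};

// In the headers model, only the TU containing this definition emits Shape's vtable.
int Shape::sides() const { return 0; }

// Every virtual function here is inline, so Square has no key function and
// its vtable is emitted with vague linkage wherever it is needed.
struct Square : Shape {
    int area() const override { return 4; }
    int sides() const override { return 4; }
};

static_assert(std::is_polymorphic_v<Shape>);
```

A class like `Square` is the case where the headers model has no natural home for the vtable, which is the gap a module-purview rule can close.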

My understanding of the issue is that, while the original rule still works, it may be technically better to have new rules for the new feature.

rjmccall commented 1 year ago

I completely agree. I opened PR #171 to address this, as well as several related issues I saw while updating this section.

ChuanqiXu9 commented 11 months ago

Since https://github.com/itanium-cxx-abi/cxx-abi/pull/171 is merged, we can close this.

jicama commented 10 months ago

Why does this not apply to all vague linkage entities defined in module purview? Shouldn't inline variables and functions also be emitted in the module unit that defines them? [oops, didn't mean to reopen, but I guess it's fine for the moment]

rjmccall commented 10 months ago

I think you're right that we could, at least for non-templated entities. It's an interesting question whether we should.

If we eagerly emit a strong definition and then don't actually need it (e.g. because we inline every use of it), we'll end up having to hope the linker dead-strips it. The linker can probably do that reliably unless it's linking a shared library, in which case it probably can't do it at all. (Unfortunately, modules still don't tell us anything about how the module and its dependencies are assembled in the overall program, so default visibility still has to be the default.)

On the other hand, of course, if we don't eagerly emit a strong definition, we're still stuck in the world of today where compilers waste huge amounts of time and energy emitting thousands of redundant definitions of inline functions into every translation unit, especially in unoptimized builds. But my sense is that most of that overhead in practice comes from templates, which modules don't help us with at all.

I don't know. Changing the ABI here would be a huge change. It's probably our only shot to make a significant dent in the redundant emission problem, though.

ChuanqiXu9 commented 10 months ago

I feel this is a pretty complex topic. For a similar case, WG21 chose to make in-class member functions in module purview not implicitly inline.

Also, inline entities in module purview are helpful for optimization. Consider this example:

export module a;
export int a() {
    return 43;
}

Currently, a consumer of module a can't inline the body of function a unless LTO is enabled. This raised some concerns about performance when I shared the topic. My answer was, "it is still OK to mark function a as inline if you really want that". But if we force all inline functions to be defined in the module purview, users lose the chance to control the inlining. (I know that controlling compiler optimizations from the user side is known to be a bad idea.)

Then for templates, it may be a compile-time optimization to emit implicitly instantiated templates only in the module unit. But we don't have a feeling for how good it will be. From the perspective of compile-time optimization, maybe we should experiment with this behind a flag, as a compiler extension, to get a feeling. Then we can make further proposals.

jicama commented 10 months ago

For a similar case, WG21 chose to make in-class member functions in module purview not implicitly inline.

Indeed. I don't remember the rationale for that change.

But if we force all inline functions to be defined in the module purview, users lose the chance to control the inlining. (I know that controlling compiler optimizations from the user side is known to be a bad idea.)

I don't understand your point; an inline function can be emitted out-of-line and also still inlined.

Then for templates, it may be a compile-time optimization to emit implicitly instantiated templates only in the module unit. But we don't have a feeling for how good it will be. From the perspective of compile-time optimization, maybe we should experiment with this behind a flag, as a compiler extension, to get a feeling. Then we can make further proposals.

The CMI could indicate which imported vague linkage entities were emitted in the module unit .o, so importers don't need to emit them as well? That would still require consistency between the compilations that produce the CMI and the .o, which might not be the same compiler invocation.

ChuanqiXu9 commented 10 months ago

an inline function can be emitted out-of-line and also still inlined.

What do you mean by emitting out-of-line? I didn't get this.

The CMI could indicate which imported vague linkage entities were emitted in the module unit .o, so importers don't need to emit them as well?

Yes, this is possible. There is a similar logic to handle non-inline entities.

rjmccall commented 10 months ago

an inline function can be emitted out-of-line and also still inlined.

What do you mean by emitting out-of-line? I didn't get this.

I assume Jason is talking about emitting a definition just for the benefit of local optimization. The idea is that inlining and function analysis should be able to see the definition, but if the compiler ultimately decides to emit a reference to it, it's just treated like an external symbol reference. In LLVM, this would be available_externally. A lazy implementation of this would be to recognize direct calls and forward them to a copy of the function with a different symbol and internal linkage, but then of course you don't get the code-size benefits of falling back on a common implementation.

mathstuf commented 10 months ago

For a similar case, WG21 chose to make in-class member functions in module purview not implicitly inline.

Indeed. I don't remember the rationale for that change.

IIRC, module function bodies have reachability effects and implicit inline forced those wanting to control that to write implementations elsewhere in the file. This was decided in Cologne 2019. See paper P1779R0.

rjmccall commented 10 months ago

Does the lack of those effects effectively force the use of unique emission, or is vague linkage still viable?

jicama commented 10 months ago

IIRC, module function bodies have reachability effects and implicit inline forced those wanting to control that to write implementations elsewhere in the file.

Right, thanks. And under https://eel.is/c++draft/basic.link#17 it's ill-formed to refer to a TU-local entity in an exported inline definition, so the change was to allow more ABI isolation by not exporting the bodies of in-class member functions by default.

I don't think this provides significant input into the current question, which has to do with functions (and variables) that are actually declared inline.

As for templates, obviously all instantiations can't be emitted in the module unit, but any instantiations that are generated during compilation of the module unit could be.

dwblaikie commented 10 months ago

As for templates, obviously all instantiations can't be emitted in the module unit, but any instantiations that are generated during compilation of the module unit could be.

Yep.

So originally when I implemented -fmodules-codegen for Clang Header Modules it was to prototype the idea being discussed here - emitting inline function definitions (including implicit template instantiations) as weak definitions in a modular object file - then emit them as available_externally (for clang's purposes) when optimizations were enabled in users of those modules. These definitions wouldn't have to be singular (though, yes, with C++20 modules they could be made strong/singular definitions).

The data I got was related to @rjmccall's estimations: It's hard to know if/where this'll be a win or a loss.

For -O0 builds of Google binaries, it was about a wash. The number of extra definitions that were emitted that would've been avoided entirely balanced out the number of duplicate definitions that were no longer emitted. This might've been especially weighted by heavy use of protobufs - substantial amounts of generated code where many functions would be emitted and possibly relatively few would ever be called. (& also the whole binary wasn't modularized, we only modularized libraries up through protobufs and not beyond since it wasn't worth it to do all the layering fixes beyond that)

So prototyping this with an opt-out for generated code like modules might be worth measuring and seeing how that looks. If that produced a positive outcome it'd point to maybe making this part of the ABI but with some opt-out that could be used by things like protobufs.

The other problem that arises is optimized code - then even if the inline function is called, it might be inlined, so there's even less chance a homed definition will end up being needed. So in optimized builds it didn't work out so well.

(oh, hey, I even gave an LLVM talk with some of the numbers in it: https://www.youtube.com/watch?v=lYYxDXgbUZ0 )

When @zygoloid implemented C++20 modules code generation, I think he opted them out of this homed-inline-functions feature, and I think I asked on the review but never learned exactly what motivated the change there. Though the unconvincing numbers are a pretty reasonable answer to that question, perhaps there was other data too.

jicama commented 10 months ago

Interesting video, thanks. That does suggest that we probably don't want to mess with inline functions.

But even putting inline functions aside, I think there's a strong case for treating inline variables the same as vtables.

Possibly only inline variables that aren't usable in a constant expression, so references can't actually be inlined away.

Also, any non-inline template instantiations that happen to be generated.
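To make the distinction concrete, here is a small sketch with hypothetical names (`limit`, `counter`, `next_id`). It contrasts an inline variable whose reads can be folded away with one whose uses always need the symbol:

```cpp
// Usable in constant expressions: a read can fold to the value 43,
// so a user of this variable often needs no symbol reference at all.
inline constexpr int limit = 43;

// Not usable in constant expressions: every read or write refers to the one
// shared object, so some object file has to provide its definition.
inline int counter = 0;

// Constant evaluation of `limit`; no definition of `limit` has to be emitted for this.
constexpr int doubled_limit = limit * 2;

// Must reference the symbol `counter`.
inline int next_id() { return ++counter; }

static_assert(doubled_limit == 86);
```

Only the second kind always leaves a reference behind, which is why it is the more natural candidate for being homed in the module unit.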

ChuanqiXu9 commented 10 months ago

When @zygoloid implemented C++20 modules code generation, I think he opted them out of this homed-inline-functions feature, and I think I asked on the review but never learned exactly what motivated the change there.

I guess the reason may be that it violated the ABI specification at the time.

But even putting inline functions aside, I think there's a strong case for treating inline variables the same as vtables.

IIUC, do you want to say:

export module a;
export inline int a = 43;

should have the same impact as

export module a;
export int a = 43;

Do I understand correctly? If yes, and if there is a strong motivation, I strongly suggest sending a paper to WG21 to forbid such uses, since it would become very confusing to talk about the concept of inline (it is already confusing enough now).

ChuanqiXu9 commented 10 months ago

an inline function can be emitted out-of-line and also still inlined.

What do you mean by emitting out-of-line? I didn't get this.

I assume Jason is talking about emitting a definition just for the benefit of local optimization. The idea is that inlining and function analysis should be able to see the definition, but if the compiler ultimately decides to emit a reference to it, it's just treated like an external symbol reference. In LLVM, this would be available_externally. A lazy implementation of this would be to recognize direct calls and forward them to a copy of the function with a different symbol and internal linkage, but then of course you don't get the code-size benefits of falling back on a common implementation.

Previously, the bodies of functions like the following could be inlined into consumers via available_externally in clang:

export module a;
export int a() { return 43; }

But in a later SG15 discussion, we concluded that this behavior violates ABI isolation. That is, we think the body of the non-inline function a() is an implementation detail and is not (or at least need not be) part of the interface. So it should be possible not to require consumers to recompile when only the implementation changes. In other words, if there is a user of module a that was compiled into user.o, and we then change the implementation of a() to return 44;, it should be valid to link the newly compiled a.o with the old user.o and get a valid program, instead of having to generate a new user.o.

rjmccall commented 10 months ago

Are there precise rules about the ABI expectations with modules? I haven't been following this closely, and I really have no idea what's expected to be allowed to change and what isn't anymore.

ChuanqiXu9 commented 10 months ago

Are there precise rules about the ABI expectations with modules? I haven't been following this closely, and I really have no idea what's expected to be allowed to change and what isn't anymore.

It is hard to state concisely and precisely, since in our discussion we didn't imagine or expect touching the ABI specification. My understanding of our consensus is: all the definitions of non-inline entities (functions and variables) can be excluded from the (abstract) interface part. Inline entities were exceptions in our previous discussion. I guess the major reason is that we didn't expect to change the ABI specification.

BTW, MSVC (I remember they don't follow Itanium C++ ABI) follows the same behavior: keep inline entities as exceptions.

rjmccall commented 10 months ago

It is hard to state concisely and precisely, since in our discussion we didn't imagine or expect touching the ABI specification.

Okay. The committee needs to understand that that's not how this works, alright? Whenever the language adds new constructs, compilers have to figure out how to compile them, and that needs to go into this specification. Modules add quite a few new constructs, and they have non-trivial implications for compilers because they create new kinds of information flow between translation units. The traditional understanding of ABI boundaries in C++ has always been that any information in the translation unit is fair game for the compiler to use. A perhaps-naive understanding of how importing a module interface unit works is that the exported declarations from that module are now available in the importing translation unit essentially as if they were declared there (just preserving the knowledge that they in fact came from a different module). So if the idea is now that some of the information in a module interface unit is not supposed to be used by its importers, that's an important shift. That's true even if you've identified a reasonable formalism that seems to unify previous behavior with the new intended interpretation, because compilers need to make sure they're implementing the intent of that formalism.

Let me try to restate the rule you're laying out so that we know we're in agreement. It sounds like the intended rule here is that the contents of function and variable definitions in the module interface unit are not part of the abstract interface of the module and should not be available to importers of the module unless the definition is explicitly marked inline. That mostly seems like an implementable rule, but I have a few questions:

ChuanqiXu9 commented 10 months ago

It is hard to state concisely and precisely, since in our discussion we didn't imagine or expect touching the ABI specification.

Okay. The committee needs to understand that that's not how this works, alright? Whenever the language adds new constructs, compilers have to figure out how to compile them, and that needs to go into this specification. Modules add quite a few new constructs, and they have non-trivial implications for compilers because they create new kinds of information flow between translation units. The traditional understanding of ABI boundaries in C++ has always been that any information in the translation unit is fair game for the compiler to use. A perhaps-naive understanding of how importing a module interface unit works is that the exported declarations from that module are now available in the importing translation unit essentially as if they were declared there (just preserving the knowledge that they in fact came from a different module). So if the idea is now that some of the information in a module interface unit is not supposed to be used by its importers, that's an important shift. That's true even if you've identified a reasonable formalism that seems to unify previous behavior with the new intended interpretation, because compilers need to make sure they're implementing the intent of that formalism.

Thanks for the write-up! I'll try to send this to SG15 to get a precise and formal expectation.

(Also, I want to clarify that what I said is a conclusion from the SG15 mailing list. SG15 is the tooling study group, and its output generally won't be merged into the traditional C++ specs. So it might not stand for "the committee".)

Let me try to restate the rule you're laying out so that we know we're in agreement. It sounds like the intended rule here is that the contents of function and variable definitions in the module interface unit are not part of the abstract interface of the module and should not be available to importers of the module unless the definition is explicitly marked inline.

The current behavior (rather than calling it a rule for now) doesn't require it to be explicitly marked inline. We drop the non-inline definition from the BMI (built module interface) so that the importers won't see the definition.

(As for the BMI: it is generated from the module interface unit, and importers load the BMI to get information about that module. The module interface unit also produces an object file, like a traditional TU. There need not be any dependency between the generated BMI and the object file for a module interface unit. So we can imagine the BMI and the object file of a module interface unit being generated by two separate processes:

CC m.cppm -generate-BMI -o m.pcm
CC m.cppm -c -o m.o

)

That mostly seems like an implementable rule,

Yes. This is the reason why we only discuss it in SG15 instead of EWG.

but I have a few questions:

  • This doesn't apply to template definitions, right? That would clearly not be implementable; clients have to be able to instantiate the definition to use the declaration.

We weren't thinking about template definitions, but about template instantiations. So if the linkage of a template instantiation is not inline linkage (maybe this is what you call vague linkage? I am not sure), then yes, the same rule applies.

This is possible via explicit template instantiations:

// a.cppm
export module a;
export template <class C>
int foo() { ... }

// Explicitly instantiate `foo` with `int`, so the function body of `foo<int>`
// is generated in a.o.
export template int foo<int>();

// user.cc
import a;
int x = foo<int>(); // We don't see the function body of `foo<int>()` here.

We think this is an important optimization technique with modules.

For implicit instantiations, as far as I know, their linkage is inline linkage, so they don't fall under this rule.
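For comparison, the headers model already has a way to home one instantiation in a single object file. The sketch below uses a hypothetical function template `twice`, and for brevity puts in one file what would normally be split between a header and a source file:

```cpp
#include <type_traits>

// Hypothetical function template, standing in for `foo` above.
template <class T>
T twice(T value) { return value + value; }

// Explicit instantiation declaration: tells a TU that twice<int> is
// instantiated elsewhere, normally suppressing the implicit instantiation.
extern template int twice<int>(int);

// Explicit instantiation definition: the TU containing this line emits
// twice<int>. It sits in the same file only to keep the sketch self-contained.
template int twice<int>(int);

// A caller just references the homed instantiation.
int use_twice() { return twice<int>(21); }

static_assert(std::is_same_v<decltype(twice<int>(21)), int>);
```

The modules version described above gets the same effect without the `extern template` line, because the BMI can omit the body of the explicitly instantiated function.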

  • Does this apply to = default definitions? At least for special members, we need to be able to analyze whether the operation is trivial in order to decide certain basic things about the ABI. This is implementable either way: we already say that special members defaulted outside of the class definition aren't trivial for the purposes of deciding whether the class is trivially copyable, and we could refine that to say that in-class defaulted members of classes in modules also have to be inline. (I think defaulted implicit declarations of special members would need to be implicitly inline, though, or else modularizing a C header would create ABI incompatibility.)

Oh, interesting. I hadn't tested this. I just checked the current behavior in clang and it turns out that the rule doesn't apply to = default definitions. I thought this was a bug, since I remembered that in-class function definitions are implicitly inline. But I find the wording implies that the current behavior is correct: http://eel.is/c++draft/dcl.fct.def.default#3.
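The existing rule mentioned in the question, that a special member defaulted outside the class definition does not count as trivial, can be checked without modules. A minimal sketch with hypothetical class names:

```cpp
#include <type_traits>

struct DefaultedInClass {
    DefaultedInClass() = default;
    DefaultedInClass(const DefaultedInClass&) = default;  // defaulted on first declaration: trivial
    int value;
};

struct DefaultedOutOfClass {
    DefaultedOutOfClass() = default;
    DefaultedOutOfClass(const DefaultedOutOfClass&);       // declared only
    int value;
};

// Defaulted after the first declaration: user-provided, hence non-trivial.
DefaultedOutOfClass::DefaultedOutOfClass(const DefaultedOutOfClass&) = default;

static_assert(std::is_trivially_copyable_v<DefaultedInClass>);
static_assert(!std::is_trivially_copyable_v<DefaultedOutOfClass>);
```

Because triviality feeds into ABI decisions such as how a class is passed, any rule that changes when a defaulted member counts as inline would need to leave these results unchanged.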

At least for special members, we need to be able to analyze whether the operation is trivial in order to decide certain basic things about the ABI.

We don't need to worry about semantic analysis since all the information is preserved in the BMI.

  • What is the expectation within a module? In particular, given a file-by-file translation model for the module implementation units, do compilers need to potentially recompile the entire module if the programmer changes a non-inline definition in the module interface unit, or are they blocked from using that definition without LTO?

The module implementation units are not special. They are treated as module consumers too. So if a non-inline definition in the module interface unit changes, it is allowed not to recompile the module implementation units.

are they blocked from using that definition without LTO

I feel the word 'blocked' may not be accurate. It is actually valid to recompile them. What we (SG15) want is to allow skipping recompilation of the module consumers when only the non-inline definitions in module interface units change. But it is always valid to recompile them. In fact, current timestamp-based build systems will always recompile them if there is any change to the module interface unit, so this is an idealized model.

jicama commented 10 months ago

But even putting inline functions aside, I think there's a strong case for treating inline variables the same as vtables.

IIUC, do you want to say:

export module a;
export inline int a = 43;

should have the same impact as

export module a;
export int a = 43;

In that in both cases the module unit contains a definition that the importer can refer to instead of emitting its own, yes.

Do I understand correctly? If yes, and if there is a strong motivation, I strongly suggest sending a paper to WG21 to forbid such uses, since it would become very confusing to talk about the concept of inline (it is already confusing enough now).

Forbid exported non-constant inline variables? I suppose they are pretty useless, and I would support warning about them, but forbidding them seems excessive.

Anyway, I'm confident we don't want to forbid

export struct A {
  static constexpr int table[] = { ... };
};

How do I write my module such that A::table is available for constant evaluation, but an importer doesn't need to emit its own copy for any non-constant uses? That seems like the behavior that we would want, but there's no way to express it currently. Though I suppose we might revive the deprecated out-of-class redeclaration to mean that.

Why treat A::table differently from a vtable?

Of course, a plain int constexpr variable will usually be "inlined" away and the definition only needed if its address is taken, much like a simple inline function. But the cost of emitting a definition of such a variable should be small, and there are typically far fewer such variables than inline functions.
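The two kinds of use being contrasted can be sketched in the headers model. The initializer values and the helper names (`second`, `table_size`, `lookup`) are hypothetical, since the original example elides the table contents:

```cpp
#include <cstddef>

struct A {
    // Implicitly inline since C++17, so today any TU that needs the object emits its own copy.
    static constexpr int table[] = {1, 2, 3};  // placeholder values
};

// Constant use: evaluated at compile time, no reference to the symbol A::table is needed.
constexpr int second = A::table[1];
constexpr std::size_t table_size = sizeof(A::table) / sizeof(A::table[0]);

// Non-constant use: indexing with a runtime value needs the actual object,
// so a definition of A::table must exist in some object file.
int lookup(std::size_t i) { return A::table[i % table_size]; }

static_assert(second == 2);
```

The open question in the thread is which object file should provide the definition that `lookup` relies on when `A` is defined in a module.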

ChuanqiXu9 commented 10 months ago

But even putting inline functions aside, I think there's a strong case for treating inline variables the same as vtables.

IIUC, do you want to say:

export module a;
export inline int a = 43;

should have the same impact as

export module a;
export int a = 43;

In that in both cases the module unit contains a definition that the importer can refer to instead of emitting its own, yes.

Let me try to clarify.

export module a;
export int a = 43;

will generate the definition of variable a in the object file for module a. While

export module a;
export inline int a = 43;

won't generate the definition of variable a in the object file for module a, but rather in the TUs that actually use the variable a.

Then you're proposing to warn for the second case?

Do I understand correctly? If yes, and if there is a strong motivation, I strongly suggest sending a paper to WG21 to forbid such uses, since it would become very confusing to talk about the concept of inline (it is already confusing enough now).

Forbid exported non-constant inline variables? I suppose they are pretty useless, and I would support warning about them, but forbidding them seems excessive.

Anyway, I'm confident we don't want to forbid

export struct A {
  static constexpr int table[] = { ... };
};

How do I write my module such that A::table is available for constant evaluation, but an importer doesn't need to emit its own copy for any non-constant uses? That seems like the behavior that we would want, but there's no way to express it currently. Though I suppose we might revive the deprecated out-of-class redeclaration to mean that.

Why treat A::table differently from a vtable?

Then you're saying A::table should be emitted in the module unit. And A::table is implicitly inline, so you're proposing we should generate the inline variable definitions (in module purview) in the module unit instead of other TUs which use it?

Then you're proposing that we should warn about the use of explicitly inline variables in module purview and change the linkage of implicitly inline variables? Do I understand correctly?

If yes, I still feel we can ask the language to stop making variables that are currently implicitly inline (like constexpr variables) implicitly inline. I feel this is more consistent.

This does not conflict with constant evaluation or optimization, since with modules we introduce a new layer called the BMI. We can emit the definition of constexpr variables into the BMI so that users of the BMI can see the definition. But we can choose not to generate the definition in the object files of its users (or we can generate it as available_externally if we want to enable more optimizations in the middle end).

Of course, a plain int constexpr variable will usually be "inlined" away and the definition only needed if its address is taken, much like a simple inline function. But the cost of emitting a definition of such a variable should be small, and there are typically far fewer such variables than inline functions.

I didn't get the point of the paragraph. But I feel it may not be relevant any more.

rjmccall commented 10 months ago

It's pretty common for classes doing complex bit-masking to build up their masks in static constexpr member variables; I wouldn't want compilers to start emitting all of those eagerly. I know it's a relatively small expense, but it's a totally unnecessary and unwanted expense.
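A typical instance of that pattern looks roughly like the following. The class `Flags` and its members are hypothetical:

```cpp
#include <cstdint>

struct Flags {
    // Masks built up from one another; each is only ever read as a constant.
    static constexpr std::uint32_t kind_bits  = 3;
    static constexpr std::uint32_t kind_mask  = (1u << kind_bits) - 1;  // 0b0111
    static constexpr std::uint32_t dirty_mask = 1u << kind_bits;        // 0b1000

    std::uint32_t bits;

    constexpr std::uint32_t kind() const { return bits & kind_mask; }
    constexpr bool dirty() const { return (bits & dirty_mask) != 0; }
};

static_assert(Flags{0b1101u}.kind() == 5);
static_assert(Flags{0b1101u}.dirty());
```

None of these members normally needs an emitted definition, so eagerly emitting all three in the module's object file would add symbols that nothing references.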

In general, I feel it's pretty easy to distinguish v-tables from ordinary variables. If a dynamic class is used at all, its v-table will almost certainly have to be emitted; the compiler being able to completely optimize away any need for the v-table of an object is very uncommon in practice. Furthermore, v-tables tend to be somewhat large, and emitting one also requires the emission of RTTI and (usually) a significant number of inline virtual functions. It's certainly possible that all of this could also apply to an arbitrary variable — if nothing else, a user could create their own v-tables — but I think it's not unreasonable to say that programmers doing that ought to be expected to make smart decisions about how those variables will be emitted. We should use a simple, predictable rule here that still gives the programmer that control over emission.

jicama commented 10 months ago

If yes, I still feel we can ask the language to stop making variables that are currently implicitly inline (like constexpr variables) implicitly inline. I feel this is more consistent.

That would work for me, and would be consistent with the change in in-class function definitions. And for constexpr variables it's easier to separate the linkage from the usability-without-symbol-reference, since they correspond to different keywords, unlike with functions.

This does not conflict with constant evaluation or optimization, since with modules we introduce a new layer called the BMI. We can emit the definition of constexpr variables into the BMI so that users of the BMI can see the definition. But we can choose not to generate the definition in the object files of its users (or we can generate it as available_externally if we want to enable more optimizations in the middle end).

Yes. For both functions and constexpr variables, I want to have a way to get inlining without inline linkage, which it sounds like matches available_externally in Clang.

The question is under what circumstances that linkage applies. I was suggesting above that the currently deprecated redeclaration at namespace scope (https://eel.is/c++draft/depr.static.constexpr) might be a suitable way to express it, if a bit cumbersome. Though (currently) that's not allowed for functions (https://godbolt.org/z/GrMMGzd8c).

ChuanqiXu9 commented 10 months ago

It's pretty common for classes doing complex bit-masking to build up their masks in static constexpr member variables; I wouldn't want compilers to start emitting all of those eagerly. I know it's a relatively small expense, but it's a totally unnecessary and unwanted expense.

We should use a simple, predictable rule here that still gives the programmer that control over emission.

The question is under what circumstances that linkage applies.

The model in my mind is, in

export module A;
export struct A {
  static constexpr int table[] = { ... };
};

The definition of A::table will always be emitted in A.o and the objects of the consumers can only see its declaration. (which is external linkage in clang).

and the programmer can control the behavior by adding an inline keyword.

export module A;
export struct A {
  static inline constexpr int table[] = { ... };
};
// end of File

Then the definition of A::table won't be generated in A.o and it will be generated in the objects of the consumers where A::table is referenced. (which is inline linkage in clang and linkonce_odr in LLVM).

I feel it is straightforward, consistent and simple. WDYT?

It's pretty common for classes doing complex bit-masking to build up their masks in static constexpr member variables; I wouldn't want compilers to start emitting all of those eagerly. I know it's a relatively small expense, but it's a totally unnecessary and unwanted expense.

In general, I feel it's pretty easy to distinguish v-tables from ordinary variables.

IIUC, are you saying it is not good to always generate the variable/vtable eagerly in the homing object?

Then there is a simple question for vtables. In,

export module A;
export struct A {
    virtual void func() { ... } 
};
// end of File

The vtable of A will always be emitted eagerly in A.o according to the change in the PR. That seems to conflict with the idea you're describing. Or do I misread it?

ChuanqiXu9 commented 10 months ago

BTW, I just wrote up a draft paper describing the ABI requirements for modules: https://isocpp.org/files/papers/P3092R0.html. The discussion thread in SG15 is https://lists.isocpp.org/sg15/2024/01/2346.php. SG15 is public, so you can chime in there if you have opinions on that.