hsutter / cppfront

A personal experimental C++ Syntax 2 -> Syntax 1 compiler
Other
5.23k stars 224 forks source link

[Suggestion] Update cppfront to enable C++23 tests for modern Clang versions #1083

Closed jarzec closed 1 month ago

jarzec commented 1 month ago

In C++23, the default deleter used by std::unique_ptr has become constexpr, which requires the destructor of the pointed-to type to be available at compile time. As a result, forward-declared types cannot be used in a default std::unique_ptr. For this reason it is impossible to compile cppfront using Clang 18 and --std=c++2b. The compilation fails with something like:

In file included from source/../include/cpp2util.h:269:
In file included from /usr/bin/../lib/gcc/x86_64-linux-gnu/14/../../../../include/c++/14/memory:78:
/usr/bin/../lib/gcc/x86_64-linux-gnu/[14](https://github.com/jarzec/cppfront/actions/runs/9196887536/job/25296050363#step:5:15)/../../../../include/c++/14/bits/unique_ptr.h:91:16: error: invalid application of 'sizeof' to an incomplete type 'cpp2::is_as_expression_node'
   91 |         static_assert(sizeof(_Tp)>0,
...

(see e.g. here).

Probably the cleanest solution would be to rework the code of cppfront to remove the use of forward-declared types used in std::unique_ptr. However, doing this in a single shot would be quite cumbersome.

A stepping stone could be to use std::unique_ptr with a custom deleter that works for forward-declared types if C++>=23 is requested. This would allow to update the code iteratively, while already enabling Clang 18 with C++23.

In short, this PR addresses the issue by introducing cpp2::impl::unique_ptr and cpp2::impl::make_unique and temporarily replacing the use of the std counterparts throughout the code by those:

In subsequent commits the GitHub regression test workflows are updated and tests files are updated accordingly. As a result, Clang 18 with C++23 (and libc++-18-dev) becomes the firsts non-MSVC compiler in the CI that succeeds to build and run all the regression tests (see here). So far all other non-MSVC compilers in the CI fail to compile at least one of the tests. It is also quite likely that MSVC is able to compile cppfront with C++23 because the constexpr version of std::default_delete is not yet implemented...

The added value of the proposed solution is also the triviality of the steps necessary to revert the changes in cppfront when the code no longer has forward-declared types in unique_ptrs. Except for the single block of code added in include/cpp2util.h all other changes in the code in the PR are find/replace.

Note that the test file updates are added in the last commit. I suggest to review the main part of the changes through the first two commits: https://github.com/hsutter/cppfront/pull/1083/commits/ffe7646cfd40c1d26a37fe31e86ec7f3bb20f28f and https://github.com/hsutter/cppfront/pull/1083/commits/2a14b2e49b41196226f33a224b59bf098f93faf8. It is possible to add comments there.

hsutter commented 1 month ago

In C++23, [...] forward-declared types cannot be used in a default std::unique_ptr.

I hadn't heard about that. Hmm... that seems like it would break a lot of code.

I just checked with gcc trunk and Clang trunk and a unique_ptr of incomplete type appears to work with -std=c++2c with both: https://godbolt.org/z/Wv1rGv8az

Except only if you you actually write a custom dtor for a type like B that contains a unique_ptr to a forward-declared type (uncomment ~B() to get the error you're seeing, at least with Clang). So wouldn't it be enough to just write that destructor out-of-line later in the file? It seems like just another ordering dependency that makes us define a function out-of-line, if I'm understanding the issue correctly.

bluetarpmedia commented 1 month ago

I think it's a Clang/LLVM bug introduced (or exposed) in Clang 15.

First, with the destructor code enabled in Herb's example, it compiles correctly in Clang 14, GCC latest and MSVC latest.

Second, although I couldn't find an LLVM bug with this exact code example, there are a couple open LLVM bugs which appear to describe the same problem:

In particular, note Richard Smith's comment in the first issue above:

The unique_ptr destructor becomes constexpr in C++23, so gets instantiated more eagerly -- at the point of first reference rather than at end of TU. unique_ptr destruction doesn't support incomplete types.

I believe he's saying "so gets instantiated [by Clang] more eagerly", rather than saying that the C++ standard requires that to happen.

A statement in the second issue seems to confirm this:

Clang instantiates the bodies of constexpr functions eagerly, other compilers seem to delay the instantiation until the end of the TU

hsutter commented 1 month ago

Thanks both for the research!

Adding emphasis, Richard is quoted as saying:

unique_ptr destruction doesn't support incomplete types.

That seems to confirm the fix/workaround is to write the dtors out-of-line for any type B that has a unique_ptr<A> member where A is incomplete at the point of B's definition, correct?

jarzec commented 1 month ago

Great to hear it is just a bug in Clang. I am currently working with C++20 and 23 is still a bit of uncharted ground for me. To me it also also seemed like a big blow for existing code to break std::unique_ptr for forward declared types.

Given that we are talking about a bug in a compiler any change in the code due to that would sound like a pessimisation. Adding unnecessary destructors, in addition, seems to be going against good practice of trying to stick to the Roule of Zero whenever possible. That is what analyzers like SonarCloud suggest and it does sound reasonable to me.

I was looking for a temporary solution that would be "minimally invasive" (a buzz term from the medical industry) and trivial to remove in the long term, hence my proposition.

It is, of course, also possible just to wait for a fixed version of Clang to become available on GitHub runners. Given that the trunk version is already OK this might be a relatively near ETA.

hsutter commented 1 month ago

It's not clear to me from the description so far whether it's a compiler bug or a change in standard behavior.

But making such destructors out-of-line shouldn't a big change... moving function to later in the file is something we have to do already for a bunch of functions for forward-decl reasons(*) especially in parse.h, and it only applies to cases where the destructor (a) was user-written anyway, and (b) is on a type that has a member of unique_ptr to incomplete type, just move the dtor body out of line... for only those cases, we move the body to later in the file when everything is defined.

(*) and wouldn't once the code is migrated to to Cpp2, because all functions are forward-declared and all the definitions are out-of-line later in the file so the problem doesn't arise

jarzec commented 1 month ago

Out of curiosity I've just tried this. Possibly not even out-of-line destructors are necessary, just code reorganization.

EDIT: Fixed the link.

hsutter commented 1 month ago

Right, I think that's because the use of B triggers the implicit generation of the ~B destructor. So moving the use of B changes whether ~B is (implicitly) defined before or after A is defined.

I think it's the same as this explicitly written destructor: https://godbolt.org/z/YjMf5cnKr

bluetarpmedia commented 1 month ago

I asked about this on Discord with a slightly smaller example:

struct B;

struct A {
    ~A() { std::cout << "~A\n"; };
    std::unique_ptr<B> pb;
};

struct B {
    ~B() { std::cout << "~B\n"; };
};

int main() {
    A a;
    a.pb = std::make_unique<B>();
}

and Brian Bi (with permission to quote him) said:

It's IFNDR. ~A() calls the destructor of pb, which instantiates std::default_delete<B>::operator() before B is complete. But implementations have the freedom to defer the instantiation until the end of the TU, so you sometimes don't see a diagnostic. It's IFNDR regardless of whether it's [the unique_ptr destructor] constexpr. It seems that constexpr just affects the implementation's choice of when to instantiate in this case (but not whether or not to instantiate)

jarzec commented 1 month ago

If fact cppreference.com states that:

std::unique_ptr may be constructed for an incomplete type T, such as to facilitate the use as a handle in the pImpl idiom. If the default deleter is used, T must be complete at the point in code where the deleter is invoked, which happens in the destructor, move assignment operator, and reset member function of std::unique_ptr.

which seems quite obvious when you read it.

Nevertheless, it is interesting that all compilers seem to have had a consensus since C++11 to accept ill formed code examples we are discussing here deferring the definition o default destructors. It seems like a natural way to implement things. The fact that the issue flagged by Clang since v15 and only for C++23 sparked the above discussion suggests it might be a nice lightning talk for CppCon 😉.

Anyway @hsutter what is you take on this PR? How about I try to implement the out-of-line dtors?

hsutter commented 1 month ago

Anyway @hsutter what is you take on this PR? How about I try to implement the out-of-line dtors?

Thanks for offering! Yes, I think for this PR the best would be to start fresh and have the only diffs be to move "just enough" dtors to be out-of-line to satisfy Clang. That should means diffs only in parse.h, because that's the only file that needs out-of-line definitions because of forward declarations to break dependency cycles. I don't think there's a need for any new smart pointer or deleter. -- Would you like to do do that, in this or a fresh PR?

I notice this PR also had other changes to YML and regression test results. Those could be a second separate PR, once we merge the PR that focuses on unblocking the Clang builds?

jarzec commented 1 month ago

I will split that into two PRs, then. I am closing this one.