jbcoe / value_types

value types for composite class design - with allocators
MIT License
41 stars 14 forks source link

Usability issues when composing indirect<T> with incomplete T? #471

Open jan-moeller opened 1 week ago

jan-moeller commented 1 week ago

I have been playing around with indirect with incomplete types and found a bit of compiler-specific divergence, and usability questions.

Compiler Explorer

struct bar;

struct foo {
    xyz::indirect<bar> expr;
};

struct bar {};

int main() { foo b; }

This compiles on gcc, but is rejected by clang and msvc. I'm not an expert, but I could imagine that clang and msvc are correct here, since at the closing semicolon of the foo definition, the compiler needs to determine which of the special member functions to generate; that requires instantiation of indirect's constructors, which has to fail at that point.

Essentially, my concern here is that the template <typename TT = T> trick defers instantiation until the point of use, but if I want to compose indirect in my own type, then I have to repeat the same trick, because the compiler-generated functions don't know about it. In other words, the trick doesn't compose naturally.

For example like this, which compiles on all 3 major compilers:

struct bar;

struct foo {
    xyz::indirect<bar> b;

    template <typename T = foo>
    foo() {}

    template <typename T = foo>
    foo(T const& other) : b(other.b){};

    template <typename T = foo>
    foo& operator=(T const& other) {
        b = other.b;
        return *this;
    };

    foo(foo const& other) = delete;
    foo& operator=(foo const& other) = delete;
};

struct bar {};

int main() { foo b; }

This is likely not obvious for a lot of users.

If you put an indirect with incomplete T inside a variant or any other type that has requirements at type instantiation time, then we run into the same issue.

struct foo {};
struct bar;

struct baz {
    std::variant<foo, xyz::indirect<bar>> b;
};

struct bar {};

int main() { baz b; }

I'm mentioning this separately since I imagine this to be a common enough usecase: You have one of a fixed set of types, but one of them is incomplete, maybe because the data structure is recursive. Intuitively, this should work (and it does if you replace indirect with unique_ptr).

Again, I could go and define the usual members appropriately myself to fix this, but if I do that, then I might as well just use unique_ptr where I have to do the same thing if I want deep copy.

Unfortunately, I don't have a great idea how to make this better. So, unless gcc is right after all: Would it be better not to support incomplete types? But then again, there might be other compelling usecases that I am not aware of.

Oh, and since this all reads very negative: great work! I actually love indirect!

jbcoe commented 6 days ago

Thanks for taking the time to play around with indirect and post an issue.

The issue you raise is interesting. While an indirect<T> member with T as an incomplete type is allowed at type-instantiation time, indirect<T> is in a bit of a funny state itself at that point so can't be a template type argument for anything which has certain type-instantiation-time requirements. I certainly would not want the trick we used in indirect to be liberally employed elsewhere as you rightly suggested.

    template <typename T = foo>
    foo() {}

The first part of your issue can be solved by declaring and later defaulting the special member functions, like we do in our compile checks compile_checks/indirect_pimpl.h, compile_checks/indirect_pimpl.cc.

#include <https://raw.githubusercontent.com/jbcoe/value_types/main/indirect.h>

struct bar;

struct foo {
    xyz::indirect<bar> expr;
    foo();
    foo(const foo&);
    foo(foo&&);
    foo& operator=(const foo&);
    foo& operator=(foo&&);
};

struct bar {};

foo::foo() = default;
foo::foo(const foo&) = default;
foo::foo(foo&&) = default;
foo& foo::operator=(const foo&) = default;
foo& foo::operator=(foo&&) = default;

int main() { foo b; }

I'll ponder variant further. Composites without requirements (like vector) work as you'd like.

Thanks for the kind words. I'd like to understand your use cases better to see how things might be improved.

jbcoe commented 6 days ago

Digging in a bit deeper, a union does not seem to have the same issues as a variant:

#include <https://raw.githubusercontent.com/jbcoe/value_types/main/indirect.h>

struct foo {};
struct bar;

struct baz {    
    baz();
    ~baz();
    baz(const baz&);
    baz(baz&&);
    baz& operator=(const baz&);
    baz& operator=(baz&&);

    union U { 
        foo f; 
        xyz::indirect<bar> i;
        U();
        ~U();
        U(const U&);
        U(U&&);
        U& operator=(const U&);
        U& operator=(U&&);
    };

    U b;
};

struct bar {
    ~bar() = delete;
};

int main() { baz b; }

https://godbolt.org/z/1cfGKddxb

I wonder where the additional type requirements are coming from for variant?

jan-moeller commented 6 days ago

The first part of your issue can be solved by declaring and later defaulting the special member functions

Ah, you're certainly right. That's actually the same thing one already has to do with the destructor/move constructor of a type containing a unique_ptr of an incomplete type, so not really a disadvantage of indirect over unique_ptr. Thanks for pointing that out!

The usecase I mostly had in mind (and with which I was experimenting with) was that of an AST-like datastructure. Say we write a super simple calculator which supports binary operators and numbers as operands. Ideally, the datastructure would look like this:

struct number {
    int value;
};

struct binop {
    char op;
    expression lhs;
    expression rhs;
};

struct expression {
    variant<number, binop> info;
};

This obviously can't work, we need to break the recursive cycle somehow. Maybe the most straightforward solution is this:

struct number {
    int value;
};

struct binop;

struct expression {
    variant<number, indirect<binop>> info;

    // TODO: some API to hide that binop is stored on the heap
};

struct binop {
    char op;
    expression lhs;
    expression rhs;
};

This fails to compile on clang and msvc.

Replacing indirect with unique_ptr or an indirect-like type without concept checks compiles fine:

template <typename T>
class my_indirect {
   public:
    constexpr my_indirect() : my_indirect(T()) {}
    explicit constexpr my_indirect(T value)
        : m_ptr(std::make_unique<T>(std::move(value))) {}

    constexpr my_indirect(my_indirect const& other)
        : m_ptr(std::make_unique<T>(*other)) {}

    constexpr my_indirect(my_indirect&& other) noexcept
        : m_ptr(std::move(other.m_ptr)) {}

    constexpr auto operator=(my_indirect const& other) -> my_indirect& {
        *m_ptr = *other;
        return *this;
    }

    constexpr auto operator=(my_indirect&& other) noexcept -> my_indirect& {
        m_ptr.swap(other.m_ptr);
        return *this;
    }

    constexpr ~my_indirect() = default;

    // TODO: operator*, ->

   private:
    std::unique_ptr<T> m_ptr;
};

struct number {
    int value;
};

struct binop;

struct expression {
    variant<number, my_indirect<binop>> info;

    // TODO: some API to hide that binop is stored on the heap
};

struct binop {
    char op;
    expression lhs;
    expression rhs;
};

I wonder where the additional type requirements are coming from for variant?

My understanding is that, since variant's copy constructor is required to be deleted unless all Ts are copy constructible, all copy constructors of all Ts have to be instantiated immediately. Similar rules apply for the move constructor and destructor.

Unlike indirect, unique_ptr doesn't have any concept checks (related to the type it's templated on) on any of these (the copy constructor just doesn't exist, and the other two only check properties of the deleter. So that's why unique_ptr and my_indirect work inside variant.

The more I think of it, the less it appears to me that this is an issue with indirect, and more of a shortcoming of variant (or rather a design trade-off). variant just doesn't work very well with types that need to check properties of incomplete types on copy-, move constructor and destructor.

Why gcc compiles the variant<indirect> usecase just fine, I have no idea. But I assume it might have implemented variant's special members with compiler magic, and the other two compilers might have not.

As to the union example you presented: I have to admit that the rules under what circumstances union special members are compiler-provided have always seemed arcane to me. In any case, they most likely fall in the same compiler magic category that I assume gcc might also apply to variant.

The somewhat sad downside of this is that I will probably have to continue roll my own indirect for the variant usecase. But I don't see how to fix that, unless one were to mandate compilers to make variant magic.

jbcoe commented 6 days ago

I’m more optimistic. This use case is important to me and I’ll see what we can do to make variant play better with indirect and polymorphic.

jan-moeller commented 6 days ago

I'm glad I brought it up, then! Thank you for your work :+1:

lewissbaker commented 5 days ago

Looking at the compile-errors in your godbolt link, it seems there are two causes for the compile-errors.

The first is that the order of constraints in the indirect::operator=(U&& u) overload have placed the constraint that should eliminate it from consideration if the RHS is of type indirect at the end. Ideally it should be placed as the first constraint so that we short-circuit and avoid evaluating the other constraints.

Applying the following change seems to make it compile:

   template <class U>
   constexpr indirect& operator=(U&& u)
-    requires(std::is_constructible_v<T, U> && std::is_assignable_v<T&, U> &&
-             !std::is_same_v<std::remove_cvref_t<U>, indirect>)
+    requires (!std::same_as<std::remove_cvref_t<U>, indirect>) &&
+             std::constructible_from<T, U> &&
+             std::assignable_from<T&, U>
   {
     if (valueless_after_move()) {
       p_ = construct_from(alloc_, std::forward<U>(u));
     } else {
       *p_ = std::forward<U>(u);
     }
     return *this;
   }

NOTE: Prefer using the concepts in the constraints rather than traits where available as this will give better diagnostics in compile-errors.

This fixes the above example so that it compiles: https://godbolt.org/z/8sj3rWed4

However, the second issue I saw was that the indirect default constructor was using constraints rather than mandates/static_assert.

Won't this cause placing an indirect<incomplete> as a data-member of a plain struct to fail to compile as this will try to determine whether it should add a declaration of a default-constructor to the plain struct?

This could also be seen if you swap the order of the types in the variant in the example above, so that the indirect type is first, and is thus default-constructed when the variant is default-constructed.

Also, the use of traits, like std::is_default_constructible_v<T> require that T is a complete type and your program is ill-formed if you try to ask the question when T is incomplete.

However, if you use std::default_initializable<T> then if T is incomplete then the constraint will simply fail to be satisfied, and the method is eliminated from the overload-set.

So, if you change your default constructor as follows:

   template <class TT = T>
   explicit constexpr indirect()
-    requires(std::is_default_constructible_v<A> &&
-             std::is_default_constructible_v<TT> &&
-             std::is_copy_constructible_v<TT>)
+    requires std::default_initializable<A> &&
+             std::default_initializable<TT> &&
+             std::copy_constructible<TT>
       : indirect(std::allocator_arg, A{}) {}

Then you can compile the following snippet:

#include <variant>
#include <concepts>

using namespace std;
using namespace xyz;

struct number {
    int value = 0;
};

struct binop;

struct expression {
    variant<indirect<binop>, number> info;
};

struct binop {
    char op;
    expression lhs;
    expression rhs;
};

static_assert(!std::default_initializable<expression>);

int main() {
    expression exp{number{42}};
}

But note that the expression type is no longer default-initializable. This is because at the end of expression definition, the compiler checks if it should generate a default constructor, which checks if the variant is default constructible, which checks if indirect<binop> is default constructible (as it's the first alternative type) which checks the constraints above, which fails to satisfy the constraints because at this point, binop is still incomplete.

This kind of makes me wonder if we shouldn't be using "mandates" for the default constructor, rather than constraints. This would potentially give the answer that some types are default constructible when they actually aren't, but would let indirect<incomplete> actually be default constructible in this case.

Also, it's not clear why you are using "Mandates: is_copy_constructible_v<T>" on the copy-constructor, but "Constraints: is_copy_constructible_v<T>" on the default constructor. Should all of the constructors simply use "Mandates: is_copy_constructible_v<T>"?

jbcoe commented 4 days ago

Thanks Lewis

I need to update indirect (and polymorphic) following the updated spec #464.

@lewissbaker I'll get back to you once spec and reference implementation are aligned.

jbcoe commented 2 days ago

The fix applied in #472 does not fix indirect-in-variant on MSVC. More work needed.

https://godbolt.org/z/rccYo1fqj

jbcoe commented 1 day ago

Replacing type_traits with concepts (as @lewissbaker suggested) in #475 fixes MSVC issues.

This may need a Library issue to be submitted as it's a change to the design (albeit a necessary one).

jbcoe commented 1 day ago

@lewissbaker re:

"This kind of makes me wonder if we shouldn't be using "mandates" for the default constructor, rather than constraints. This would potentially give the answer that some types are default constructible when they actually aren't, but would let indirect actually be default constructible in this case."

I'm keen for traits not to lie. indirect is copy-constructible by design but not all indirect's need to be default constructible.

jbcoe commented 1 day ago

Fixed in #475

We'll submit a library issue to update the spec.

jbcoe commented 1 hour ago

Update: we are continuing to look into this in https://github.com/jbcoe/value_types/pull/478