mapbox / variant

C++11/C++14 Variant
BSD 3-Clause "New" or "Revised" License
371 stars 100 forks source link

Base recursive_wrapper implementation on std::unique_ptr #89

Open tomilov opened 8 years ago

tomilov commented 8 years ago

It is simpler to design recursive_wrapper as "wrapper" over ready-to-use to RAII std::unique_ptr. Just add value semantics (example).

joto commented 8 years ago

If I understand correctly, recursive_wrapper was implemented specifically because it was faster than using unique_ptr. Building recursive_wrapper on top of unique_ptr would defeat that purpose. On the other hand if the recursive_wrapper implementation was incomplete, that's not good either

@artemp can you shed some light on this?

One other thing to keep in mind: The implementation of recursive_wrapper was basically taken wholesale from boost, just ported to C++11. Keeping compatibility with boost is one of our goals here.

tomilov commented 8 years ago

I always thought that the use of the std::unique_ptr does not lead to any run-time performance degradation. It is as simple as possible, I sure. Keeping compatibility with boost in sense of immutability of interface (as opposite to implementation) should be sufficient, is not that so?

artemp commented 8 years ago

One other thing to keep in mind: The implementation of recursive_wrapper was basically taken wholesale from boost, just ported to C++11. Keeping compatibility with boost is one of our goals here.

Correct ^ . Our recursive_wrapper as complete as boost one in this regard.

If we can alias std::unique_ptr to recursive_wrapper and performance is not an issue than, yes, by all means lets do it. Not a priority for https://github.com/mapbox/variant/milestones/1.1.0 though. /cc @joto @tomilov @lightmare

artemp commented 8 years ago

Just reading https://github.com/mapbox/variant/issues/50 - perhaps this should go into 1.1.0 it we have a consensus ? /cc @joto @tomilov @lightmare

lightmare commented 8 years ago

:+1: - and as for throwing destructors, they're such a dark matter that I probably shouldn't have added any noexcept(something complex) to variant destructor in the first place. Just require that alternatives have non-throwing destructors and be done with it.

joto commented 8 years ago

Could we keep the discussion in this issue to the question it was started with? It is hard enough to keep track of all the different problems we have as it is.

@artemp what do you refer to in the sentence "perhaps this should go into 1.1.0"?

artemp commented 8 years ago

@joto - I referred to implementing recursive_wrapper as an alias to or based on std::unique_ptr in 1.1.0 (https://github.com/mapbox/variant/milestones/1.1.0).

cbeck88 commented 8 years ago

I think boost::recursive_wrapper is not interchangeable with std::unique_ptr.

The difference has to do with how move works and what the empty state is.

A std::unique_ptr is permitted to be vacant. Default constructing it puts it in the empty state. And moving it, simply moves the pointer, leaving the moved from object in the empty state. So it's very fast and requires no pointer dereferences.

A boost::recursive_wrapper<T> is never permitted to be vacant. Default constructing it, default constructs an instance of T on the heap. Moving it, creates a new dynamic allocation and move constructs from the T value -- it's not simply a pointer move. This is necessary for boost::variant because boost::variant is strictly never empty. If boost::recursive_wrapper can move and be left in an empty state, it would mean that visiting a boost::variant after it is moved from can be UB, since its interface pierces the recursive_wrapper transparently to the user.

If mapbox::variant works like std::variant and legitimately has and supports an empty state, then this might not be an issue -- probably you want the pointer move for efficiency. But I think you would want to explicitly set the moved-from variant to empty, not leave it holding a dead unique pointer, so that if it is visited after that, you get an exception rather than null dereference.

I guess the other reason I would advocate against basing things on std::unique_ptr just because is, variant is a pretty simple concept. In my implementation the only standard headers I bring in for the core definition are <cstddef>, <new>, <type_traits>, <utility>. On gcc the total preprocessed source for a trivial variant program is about 4000 lines of code. By contrast, bringing in one of the heavier standard library components like iostream, memory, or functional jumps you up to about 25kloc. Sure std::unique_ptr is very small and simple, but <memory> also brings in std::shared_ptr, the whole <atomic> header, etc. etc. I guess if you only need <memory> for std::unique_ptr, and you need to modify the unique_ptr anyways to do what you want, then why bother, at least that's the way I looked at it.

tomilov commented 8 years ago

@cbeck88 recursive_wrapper is wrapper around pointer (or reference) — this is essential thing for its implementation (due to wrapped type's incompleteness at point of instantiation of recursive_wrapper). Moreover, it is a pointer to object, which in turn placed (undoubtedly) on the heap. Using explicit new operator often considered as a bad practice. Therefore I can conclude using <memory> cannot be avoided, isn't it? I think <new> should never be encountered in the code for purposes to allocate some raw memory (just only for placement operator new or overloading, redefinition or something else less straitforward).

In the starting comment I gave the link to possible implementation of recursive_wrapper based on std::unique_ptr. Its emptiness is algoritmically prohibited into the very variant's logics. And I sure it is the best place for it, because (variant's) never-empty-guarantee does not apply to recursive_wrapper.

BTW I sure variant cannot be as simple as you mentioned above. It is quite entangled being.

cbeck88 commented 8 years ago

@tomilov: Yeah, I mostly agree with you. But I think it's different for library code perhaps. If I only have one explicit new (which isn't placement new) in my whole lib, for recursive_wrapper, it's ok in my book if it will improve compile times.

I wish that they had put unique_ptr in <utility> or even <new>, it would be better in one of those places than in <memory> IMO.

Re never-empty-guarantee, I guess I just wrote a documentation section about this in my lib :p so it is fresh in my mind. IMO it's not very good to say "oh, the variant isn't empty, it merely contains a container that is empty" and give the user UB by transparently dereferencing null. If you want to document "hey user, you get UB if you visit the variant after it is moved from" that's up to you, but it also introduces other holes in the implementation -- what if we assign to the variant afterwards, to bring it back to life? It means, whenever we manipulate recursive_wrapper, we have to branch on whether this pointer is null or not? IDK it just makes it more complicated. In boost::variant, recursive_wrapper is not really a container -- rather it represents a T value which is allocated on the heap, and the pointer to it is not allowed (as part of the business logic) to be null. If you change that without proper safeguards in place, it is a subtle difference in the interface I think.

Re: preprocessed source complexity:

On my machine, this program

#include <cstddef>
#include <new>
#include <utility>
#include <type_traits>

int main() {}

gives me 3225 line count with gcc 5.4.0 when I check preprocessed source size like so:

$ gcc -std=c++11 -E main.cpp | wc -l
3225

clang 3.8 result is similar:

 $ clang -std=c++11 -E main.cpp | wc -l
3631

And sloccount counts my whole lib at 1507 lines. So that's what I based my comment on.

Granted, that preprocessed source lines is definitely not the only thing that affects compile times.

Edit

With this program:

#include <strict_variant/variant.hpp>

int main() {}

I get this line count:

$ g++ -std=c++11 -E -I../include example.cpp | wc -l
5039
lightmare commented 7 years ago

Can we revisit this, please?

One of the concerns raised against unique_ptr was performance. Where is this coming from? I don't see any operations in recursive_wrapper that would be slower with unique_ptr. On the other hand, all recursive_wrapper constructors pay the cost of allocation; I'll come back to this one, because it is much worse than it looks at first glance.

Second concern was that with unique_ptr, operations on a moved-from variant would be UB (dereferencing nullptr).

If you want to document "hey user, you get UB if you visit the variant after it is moved from" that's up to you, but it also introduces other holes in the implementation -- what if we assign to the variant afterwards, to bring it back to life?

It's the other way around. Unless the documentation explicitly says an operation can be performed on a moved-from value, you should assume STL convention: such value is in an indeterminate state and should only be assigned to or destroyed.

variant<int, const char*, std::string> a, b;
a = std::string("eggs");
b = std::string("bacon");
b = std::move(a);
// at this point, what `a` contains is nobody's business
// you certainly shouldn't be visiting it

After the move, a could contain empty std::string, or a non-empty one, or a (const char*)"spam", or int(42), or be in the no_init state that mapbox::variant also has. The fact that it currently only moves the stored value, keeping the alternative type, is an implementation detail -- it can help with performance in cases where you then assign another value of the same type; especially when the type implements move assignment by swapping pointers to dynamically allocated memory.

Now back to the cost of construction.

struct Terminal {};
struct NonTerminal;
using Tree = variant<Terminal, recursive_wrapper<NonTerminal>>;
struct NonTerminal { Tree child; };

Tree grow(int n)
{
  Tree t = Terminal{};
  for (int i = 0; i < n; ++i) {
    t = NonTerminal{ std::move(t) };
  }
  return t;
}

How many NonTerminal instances will grow(n) allocate? The answer is not on the order of O(n). Each assignment to t in the loop increases the height of the tree, constructing i+1 recursive_wrappers, and each wrapper allocates a new NonTerminal. So the total number of allocations done by grow(n) is O(n**2). I'll make a PR to benchmarks showcasing this.

This is really bad for bottom-up construction of tree-like variants. Such as in mapnik expression parser.

cbeck88 commented 7 years ago

So, here's the issue with the code example you gave, where we assume recursive_wrapper has been made an alias of unique_ptr without other changes:

variant<int, const char*, recursive_wrapper<std::string>> a, b;
a = std::string("eggs");
b = std::string("bacon");
b = std::move(a);           // b's value is recursive_wrapper<std::string>,
                            // invoked the move ctor from a's value.
                            // So a's value is now in the empty state,
                            // but a itself is not -- it has the third value type still.

// Somewhere later, a is passed into a function
if (!a.empty()) {              // If it's not empty, it's safe to visit, right?
    a.visit(my_functor{});     // Ouch! Dereferenced a null unique_ptr here.
}

The problem is that, recursive_wrapper is supposed to be pierced transparently for the user. It allows you to do things like define a variant over an incomplete type, and have it all "just work". The visitors take a std::string &, not recursive_wrapper<std::string> &. If the recursive_wrapper has an empty state, then "transparently piercing it" becomes "transparently giving you UB".

However, if you put support logic for this in variant, then you probably could have

variant<int, const char*, recursive_wrapper<std::string>> a, b;
a = std::string("eggs");
b = std::string("bacon");
b = std::move(a); // Puts `a` in empty state, no dead unique_ptr
assert(a.empty());

So, I take back what I originally wrote -- I don't think basing recursive_wrapper on unique_ptr is necessarily bad, but it needs some support within your (sometimes empty) variant. (I think that it is strictly a bad idea if you have a never-empty variant.)

The performance issue you raise here is actually extremely similar IMO to something that happens when using boost::variant with the boost::spirit library. I posted an SO question once about how to get rid of these copies: http://stackoverflow.com/questions/41433234/how-to-avoid-unnecessary-copies-with-boostvariant-recursive-xml-structure

To cut a long story short, it is my opinion that boost::variant should be patched to allow moving recursive_wrapper by pointer in some cases, to avoid the extra dynamic allocations. Specifically: boost::variant has a feature called boost::blank. When boost::blank is the first template parameter, it uses it as the natural empty state for the variant, and avoids making a lot of extra dynamic allocations, instead reverting to this empty state in case of an exception during type-changing assignment etc. Basically it makes it have semantics similar to C++17 std::variant, and mapbox::variant. One might hope then that when boost::blank is present, code in which a recursive_wrapper is moved will avoid dynamic allocations and set the moved-from variant to the empty state instead. However, it's not the case currently -- it still does the "dumb" thing. I'm not sure if the reason they do this is for backwards compatibility, but as lightmare writes, code is not supposed to make assumptions about the state of a container after it is moved from, it's simply in a "valid but unspecified state". Probably they just didn't get around to it (?)

lightmare: IMO, a state in which the variant does not believe itself to be empty, but it's still illegal to visit it, is not a valid state. But, as long as you make sure the moved-from unique_ptr's get cleaned up this way, I agree that it would be an attractive optimization.

I'm hoping at some point to get around to adding this to my variant implementation -- mine is more like boost::variant semantics, but I didn't make a blank type yet, as I didn't need it and no one asked for it. But if/when I do I would definitely add this optimization.

lightmare commented 7 years ago

// Somewhere later, a is passed into a function

As it stands, the moved-from state is not documented, so it could be anything. I agree that a "broken" state is not fancy, but an unspecified state isn't any more useful, and a correct program never operates on those; the moved-from variant must be assigned a new value before you use it again / pass it around, e.g.:

a = no_init{};

I don't really care if the implementation is changed such that move constructor/assignment resets the moved-from variant to no_init right away. It'd sure be cleaner. Delaying destruction of the stored alternative for the small chance that it will be reused is probably not worth having a "broken" state.

But I digress. I wanted to focus on recursive_wrapper move constructor and assignment, those shouldn't allocate. What variant then does with null wrappers is secondary.

cbeck88 commented 7 years ago

An "unspecified but valid state" observes the basic preconditions of the object. The things you take for granted, like, it is okay to copy or move the object.

Suppose you have this code

b = std::move(a);
c = std::move(a);

If a, b, c are std::vector<T> for instance, or any other standard container, this will always work fine, and it is required to by the standard. If a, b, c were a variant type with a "broken" state, then what will happen when we move to c from the broken state?

First possibility: The variant blindly dereferences the dead pointer in a, in order to move construct it's supposed contents into c. So we got UB just from moving the container. Is this a "valid state"? Is variant no longer always going to be "move constructible"? Is there any way for the user to detect when UB would result? If not, then how can they write correct code?

Second possibility: The variant always tests if the pointer is dead before dereferencing it. But this introduces run time overhead, which you don't have when recursive_wrapper is never empty. In some applications these tests will be more expensive in the end than the dynamic allocations you are trying to avoid.

As it stands, the moved-from state is not documented, so it could be anything. I agree that a "broken" state is not fancy, but an unspecified state isn't any more useful ... Delaying destruction of the stored alternative for the small chance that it will be reused is probably not worth having a "broken" state.

I guess I don't agree with this. After a point this becomes semantics arguments, what is "valid", what do you consider to be the guarantees and the intended interface... but IMO only broken code has broken states like described. Unspecified state is definitely a lot more useful.

lightmare commented 7 years ago

If a, b, c are std::vector for instance, or any other standard container, this will always work fine, and it is required to by the standard. If a, b, c were a variant type with a "broken" state, then what will happen when we move to c from the broken state?

It'll just move the (null) pointer, no need to dereference, but I see where you're going. The broken state will spread to c, and with no means of telling whether a variant is in a broken state, will lead to UB later.

But this introduces run time overhead, which you don't have when recursive_wrapper is never empty. In some applications these tests will be more expensive in the end than the dynamic allocations you are trying to avoid.

We're talking constant-time move and constant overhead on other operations -- setting the moved-from variant to no_init state and destroying the wrapper (not the value it wraps); or checking pointers, if we allow nullptr in the wrapper (broken state) -- versus linear-time move we currently have (linear in the number of variants in the tree).

The purpose of recursive_wrapper is to be able to construct trees. Linear-time move is just as bad for variant as it would be for std::list.

cbeck88 commented 7 years ago

So, consider this example:

variant<int, recursive_wrapper<std::string>> a, b;
a = std::string{"asdf"};
b = std::move(a);
a = std::string{"jkl;"};

In general, all variants (that I know of), when a non-type changing assignment occurs, will not destruct and then reconstruct the value type -- instead, they use the copy assignment operator for that type, because it may be more efficient. I didn't read mapbox variant's implementation in this respect, but I assume it does the same.

So, when we try to resurrect the moved-from a in the above code, what it will do is, interpret the storage as recursive_wrapper<std::string>, then fetch the pointer from there and invoke std::string copy assignment operator. (Edit: Actually it will invoke the move assignment operator in this specific case, but it's not a big difference here.)

If your recursive_wrapper<std::string> doesn't check for null before this happens, then you'll get UB. Note that generally "moved from" objects can be resurrected -- they are not permanently broken. That code example is something every C++11 programmer will expect to work, regardless of what you think a "valid state" is.

So, if you allow null recursive wrapper, you need to add checks for whether or not it is null when you implement copy assignment operator of variant. (And not only here, but also other special member functions of variant.) But these checks create runtime overhead that other variants won't have. Again, sometimes tests like this won't ultimately matter if branch prediction works very well, but if you work in a more restrictive environment, e.g. embedded devices, cross-compile via emscripten for javascript, you won't have the benefit of branch prediction at all. Branch prediction aside, these extra checks may inhibit optimization by discouraging the compiler from inlining some of this code. Depending on the usage profile in the application, this may ultimately be more expensive than the dynamic allocations you are hoping to avoid in all this.

If you don't collapse all the "empty" states into one empty state, that you test for once during variant dispatch, you ultimately are going to pay a big price. I mean I don't see why you wouldn't engineer it that way.

lightmare commented 7 years ago

So, when we try to resurrect the moved-from a in the above code, what it will do is, interpret the storage as recursive_wrapper, then fetch the pointer from there and invoke std::string ~copy~move assignment operator.

Yes, not destroying the stored alternative when moving from a variant is more efficient, it's the right thing to do with non-wrapped alternatives. But when moving from a recursive_wrapper, allocating new rather than stealing the pointer to the wrapped value causes variant move construction and assignment to have linear complexity, because the whole tree structure is duplicated during the move, only in the source variant all the nodes are left holding moved-from values.

139 confirms the difference in complexity with timings, forgot to link it here.