Open lightmare opened 7 years ago
I decided to turn this PR into an actual fix for the issue demonstrated by the added benchmark.
Rather than going the "make recursive_wrapper
a unique_ptr
with copy" route, I only did a small change -- make recursive_wrapper
's move constructor act like unique_ptr
's, and in variant_helper::move
destroy the moved-from value afterwards (regardless of whether it's a recursive_wrapper
or not).
This would be a set-back for other variant implementations that optimize same-type assignment, but not for this one. mapbox::variant
's assignment unconditionally destroys the active alternative first, so the worst-case slowdown this change incurs is equivalent to helper_type::destroy(detail::invalid_index, nullptr);
.
Hmm, just a thought: you might want to rename the "variant_helper::move" function to variant_helper::destructive_move
, since that's what it does now? Similar to this proposal: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4158.pdf
Or variant_helper::move_and_destroy
?
Another thought: Your patch is a lot smaller and simpler than I expected :D
But one of the reasons for this is that you change the behavior slightly in all cases, not only when recursive_wrapper
occurs.
For instance, suppose I have some code
variant<int, std::string> a, b, c;
c = std::string{"foo"};
b = std::move(c);
a = std::move(b);
c = std::string{"bar"};
b = std::move(c);
a = std::move(b);
In current code, the last three lines of code will all end up invoking move assignment operator of std::string. After this patch, it's sometimes the move constructor instead, because after c and b are moved from their things get destroyed.
The reason C++ has copy assignment operators and move assignment operators is that it may in some cases be more efficient than simply destroying and using a constructor. The object may need to acquire resources, or populate a small lookup table or something.
In most cases though it's not a big difference. Also, your way of doing it is certainly valid under the usual definition of move semantics, the programmer can only assume a "valid state" after the move.
I wonder if there's any non-artificial examples that would be slowed down by this.
Potentially, you could try to make it so that it only does the "move and destroy" logic when the type is a recursive_wrapper
. That means that if move assignment of the user type is faster than destroy and move construct, they get to retain that optimization opportunity in more cases. But it would also be a little more complicated to program.
In current code, the last three lines of code ...
... destroy the target first, then move-construct a new string. https://github.com/mapbox/variant/blob/d2588a8f1d6b5d480d228e6d8a906ce634bdea9a/include/mapbox/variant.hpp#L629
After this PR, they receive an already-destroyed target, so the call to helper_type::destroy
(that just recurses through all alternatives doing nothing) is newly introduced overhead. Which also kinda answers your question about non-artificial examples. This affects every assignment to a moved-from variant, e.g. std::swap
does that twice, so it's not uncommon.
Potentially, you could try to make it so that it only does the "move and destroy" logic when the type is a recursive_wrapper. That means that if move assignment of the user type is faster than destroy and move construct, they get to retain that optimization opportunity in more cases. But it would also be a little more complicated to program.
Yes, that's what I was aiming for, but then noticed that assignment unconditionally destroys the target first, so until that is changed (and I didn't want to extend the scope of this PR), it wouldn't buy much.
Hmm right you are, that is a quirk of mapbox variant I guess
Look at the times for "construction" of large trees. With
recursive_wrapper
a 5x increase in tree size results in1485368 / 57397 > 25x
increase in running time. Withunique_ptr
the increase in running time is only171 / 41 > 4x
.There's one strange surprise, though. The first timing below TYPE is from an existing calculation test, and
unique_ptr
is almost twice as fast after the change (which doesn't touch any of that code), whereas before the change, there was no difference:That's with
gcc-6.2
. I previously tried with 4.8 and 5.4 and there were differences but not really convincing;recursive_wrapper
was faster, then I slightly changed the code to only construct onetest::calculator
instead of constructing temporaries for every operation, and suddenlyunique_ptr
was faster; then I reverted that change, added thebench_large_tree
function (without calling it) and the times were almost identical.