microsoft / STL

MSVC's implementation of the C++ Standard Library.
Other
10.19k stars 1.5k forks source link

`<string>`, `<sstream>`: Is the `_String_constructor_rvalue_allocator_tag` internal constructor necessary? #4929

Closed frederick-vs-ja closed 1 month ago

frederick-vs-ja commented 2 months ago

When considering the bug reported in LLVM-101960 for MSVC STL, I found that in addition to optional, one internal constructor of basic_string is buggy in a similar way (Godbolt link). The constructor was added by #919 in responding to review comments in https://github.com/microsoft/STL/pull/919#discussion_r450596789.

Example

```C++ #include #include template constexpr bool is_basic_string_v = false; template constexpr bool is_basic_string_v> = true; template constexpr bool is_basic_string_view_v = false; template constexpr bool is_basic_string_view_v> = true; struct NastyStringConverter { template && !is_basic_string_v, int> = 0> operator T() const { return T{}; } }; int main() { // in C++17 the initializer list ctor is selected // in C++20 the internal ctor is selected, which doesn't seem conforming std::string(NastyStringConverter{}, std::allocator{}); } ```

https://github.com/microsoft/STL/blob/91e425587a0b9b5ac0d1005844c661fee2e2813b/stl/inc/xstring#L1148-L1152

Instead of fixing this constructor, I guess it might be better to remove it. As of LWG-2593, move construction must not change the value of the source allocator, so it's arguable that there shouldn't be observable difference between move and copy construction. But it seems allowed that side effects can be different.


There's a similar thing in list: https://github.com/microsoft/STL/blob/91e425587a0b9b5ac0d1005844c661fee2e2813b/stl/inc/list#L807-L810

StephanTLavavej commented 2 months ago

We talked about this at the weekly maintainer meeting, reading the relevant Standardese WG21-N4988 [stringbuf.members]/10:

Returns: A basic_string<charT, traits, Allocator> object move constructed from the basic_stringbuf's underlying character sequence in buf. This can be achieved by first adjusting buf to have the same content as view().

Overall, we were mildly concerned about disrupting overload resolution for such a popular type with machinery not directly depicted in the Standard, even if it fixes highly pathological scenarios. We believe that virtually no code anywhere cares about whether an allocator is copy-constructed or move-constructed (even though move-constructing an allocator can theoretically transfer internal caching blah blah that doesn't participate in allocator equality). The basic_stringbuf wording just says that the basic_string is "move constructed" without directly talking about the allocator.

(However, it's fine to move-construct allocators when doing so doesn't require extra internal machinery to be added, e.g. as we're doing in _Node_handle's move constructor.)

To summarize: We are in favor of removing this tagged constructor and simply copy-constructing the allocator here. We believe that this will simplify the codebase and real world code won't ever care about the difference. If anyone complains about this in the future, we can ask LWG to clarify the handwavy wording here.

frederick-vs-ja commented 2 months ago

We believe that this will simplify the codebase and real world code won't ever care about the difference.

I've found that some allocators in libcxx's test suit intentionally behave differently between copy and move construction and such differences are being tested. If we switch to copy-construct the allocator at this moment, perhaps the related tests still pass for MSVC STL, but I'm not sure whether the test will change in the future...

StephanTLavavej commented 2 months ago

We talked about that at the weekly maintainer meeting and @CaseyCarter said he wasn't super motivated by the possibility of a library test suite adding wacky allocators in the future. I said "we can cross that bridge if we come to it, we have plenty of libcxx skips and can always add more". Casey then mentioned that if we were concerned about node allocators, they can potentially have free-lists that benefit from moving, but this is an allocator for strings.