Closed daltenty closed 9 months ago
Plenty of discussion about this happened in https://discord.com/channels/636084430946959380/636732894974312448/1179207436532469800.
We'll summarize later once we have more info.
@daltenty dan Hi, I am new to LLVM so I just want to understand the situation better. I hope you can spend sometimes helping me with the following questions.
1) Are the function clear()
and shrink_to_fit()
meant to deallocate the current allocation of string p
, before moving the contents of value
over?
2) and now, since reserve() should not shrink, meaning it should not do any deallocation, then the function shrink_to_fit() now never deallocates, that is how the memory leak occurs?
Thanks a lot in advance!
@daltenty dan Hi, I am new to LLVM so I just want to understand the situation better. I hope you can spend sometimes helping me with the following questions.
- Are the function
clear()
andshrink_to_fit()
meant to deallocate the current allocation of stringp
, before moving the contents ofvalue
over?
In general shrink_to_fit() (and reserve() which is how shrink_to_fit() was implemented in LLVM6) are non-binding requests to change the allocation of the string, so this isn't a good pattern to use to do deallocation because it's not guaranteed and the modern code doesn't use this anymore.
In past versions of libc++ however, it did.
- and now, since reserve() should not shrink, meaning it should not do any deallocation, then the function shrink_to_fit() now never deallocates, that is how the memory leak occurs?
Yes, that's correct as of p0966r1 from C++20 being implemented. Existing code has the bad pattern though, and so since we no longer do deallocation we get the leak.
I've posted an attempt at a fix here: https://github.com/llvm/llvm-project/pull/74168
It retains the behaviour for the restricted case of the cleared long string which reserve(0), and versions the function to the new abi going forward.
So basically, it looks like we broke the ABI in 0407ab4114dbdbd1845df712639bdbc84ec6df2c by changing the fact that reserve(0)
in the dylib would free memory, which wasn't relied upon anymore by the library at the time of the commit but had been relied upon in previous versions of the headers. Nasty.
Instead, we should have retained the pre-C++20 version in the dylib and made a new C++20 version with a different mangling. I think the ship for doing that has sailed, though. If we change the behavior of reserve(0)
back to deallocating, we'll fix the memory leak with programs compiled with LLVM <= 6.0, but we will also break the ABI for all programs compiled with LLVM >= 14.0, since those may rely on the fact that the memory is not being deallocated. I don't know whether there are such programs out there and TBH it's really hard to tell, however the consequences of violating that assumption are arguably worse than a memory leak, since that would lead to a use-after-free bug, which is a security issue.
I don't think I see a way of fixing this bug without making things potentially worse, does anyone have thoughts?
Thanks very much for the analysis and example @ldionne. I see what you mean, since C++20 users have been able to rely on the new behaviour so, for example, they may expect iterator invalidation does not occur when calling reserve(0) as you say.
However, I think the code pattern in the LLVM 6.0 header might save us in this case. The string is always made empty()
before the call to reserve(0)
, so we actually don't need to revert the behaviour in all cases, only the special case where the string is empty (which is what I'm proposing in https://github.com/llvm/llvm-project/pull/74168 currently)
Off-hand, I'm not seeing how someone could be holding a valid reference to the buffer in a size() == 0
string (whatever we did to make it empty should have been iterator invalidating IIUC), though I definitely could be wrong.
Please let me know what you think.
I don't know whether there are such programs out there and TBH it's really hard to tell, however the consequences of violating that assumption are arguably worse than a memory leak, since that would lead to a use-after-free bug, which is a security issue.
That's only in terms of the implementation behaviour. Clearly, reserve
is allowed to invalidate iterators, etc. due to reallocation to increase the capacity. Unlike vector::reserve
, there is no wording to say that iterators, etc. remain valid in the absence of reallocation, and https://wg21.link/string.require#4 provides permission for the invalidation to occur. It appears C++23 programs should be written only to rely on non-reallocation in terms of performance (and not functional correctness).
It's a difficult argument to make that data()
should point somewhere else in the absence of reallocation though.
Off-hand, I'm not seeing how someone could be holding a valid reference to the buffer in a
size() == 0
string (whatever we did to make it empty should have been iterator invalidating IIUC), though I definitely could be wrong.
The issue is that the iterator may have been latched between the operation to make the string empty and the call to reserve
.
#include <string>
extern "C" int printf(const char *, ...);
std::string longstring(
"12345678901234567890123456789012"
"12345678901234567890123456789012"
);
int main(void) {
longstring.clear();
const auto lsend = longstring.end();
longstring.reserve(0);
const auto lsbegin2 = longstring.begin();
const auto lsend2 = longstring.end();
printf("%ld\n", std::distance(lsbegin2, lsend2));
printf("%ld\n", std::distance(lsbegin2, lsend)); // should be 0 _if_ invalidation is not allowed for non-reallocating `reserve`
}
Given the discussion and potential risks of the fix identified in this item, I've mark this as Won't Fix
The following test program exhibits memory leaks when built with libc++ versions <= 6.0 and run against newer libc++ shared/dylibs:
Previous versions of libc++ before 7.0 used clear and shrink_to_fit in the move constructor to release the existing allocation as part of the move:
https://github.com/llvm/llvm-project/blob/d359f2096850c68b708bc25a7baca4282945949f/libcxx/include/string#L2106
In the same versions, shrink_to_fit() is implemented as an inline call to reserve().
Unfortunately, the C++ standard changed the semantics of reserve() in https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0966r1.html to never deallocate, which libc++ now implements (https://github.com/llvm/llvm-project/commit/4d64d7dd641d385439a354db1f98cbb0ab7167f9).
The deallocation was non-binding according to the standard, but unfortunately the string header implementation prior to LLVM 6.0 depends on that behaviour being present in the shared/dylib.