llvm / llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies.
http://llvm.org
Other
28.61k stars 11.82k forks source link

[libc++] basic_string move assignment can leak memory #73683

Closed daltenty closed 9 months ago

daltenty commented 10 months ago

The following test program exhibits memory leaks when built with libc++ versions <= 6.0 and run against newer libc++ shared/dylibs:

$ cat test.cc
#include <string>

std::string* p;

void SetNoArena(::std::string&& value)
{
        *p = std::move(value);
}

int main()
{
        p = new ::std::string("Hello World!");

        const char value1[128] = "A long enough string for heap allocation";

        while(true){
                SetNoArena( ::std::string(value1) );
        }
        delete p;
        return 0;
}
$clang++ -nostdinc++ -isystem ~/libcxx-6.x/usr/local/include/c++/v1/ -nostdlib++ -L ~/libcxx-6.x/usr/local/lib/ -lc++ -lc++abi test.c
$export LD_LIBRARY_PATH=${HOME}/libcxx-17.x/usr/local/lib/:/usr/lib:/lib
$ ./a.out &
[1] 581514
$ ps -o pid,args,pmem
    PID COMMAND                     %MEM
 581514 ./a.out                      0.7
...
$ ps -o pid,args,pmem
    PID COMMAND                     %MEM
 581514 ./a.out                      0.8
...
$ ps -o pid,args,pmem
    PID COMMAND                     %MEM
 581514 ./a.out                      0.9

Previous versions of libc++ before 7.0 used clear and shrink_to_fit in the move constructor to release the existing allocation as part of the move:

    clear();
    shrink_to_fit();

https://github.com/llvm/llvm-project/blob/d359f2096850c68b708bc25a7baca4282945949f/libcxx/include/string#L2106

In the same versions, shrink_to_fit() is implemented as an inline call to reserve().

Unfortunately, the C++ standard changed the semantics of reserve() in https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0966r1.html to never deallocate, which libc++ now implements (https://github.com/llvm/llvm-project/commit/4d64d7dd641d385439a354db1f98cbb0ab7167f9).

The deallocation was non-binding according to the standard, but unfortunately the string header implementation prior to LLVM 6.0 depends on that behaviour being present in the shared/dylib.

ldionne commented 10 months ago

Plenty of discussion about this happened in https://discord.com/channels/636084430946959380/636732894974312448/1179207436532469800.

We'll summarize later once we have more info.

changkhothuychung commented 10 months ago

@daltenty dan Hi, I am new to LLVM so I just want to understand the situation better. I hope you can spend sometimes helping me with the following questions.

1) Are the function clear() and shrink_to_fit() meant to deallocate the current allocation of string p, before moving the contents of value over?

2) and now, since reserve() should not shrink, meaning it should not do any deallocation, then the function shrink_to_fit() now never deallocates, that is how the memory leak occurs?

Thanks a lot in advance!

daltenty commented 10 months ago

@daltenty dan Hi, I am new to LLVM so I just want to understand the situation better. I hope you can spend sometimes helping me with the following questions.

  1. Are the function clear() and shrink_to_fit() meant to deallocate the current allocation of string p, before moving the contents of value over?

In general shrink_to_fit() (and reserve() which is how shrink_to_fit() was implemented in LLVM6) are non-binding requests to change the allocation of the string, so this isn't a good pattern to use to do deallocation because it's not guaranteed and the modern code doesn't use this anymore.

In past versions of libc++ however, it did.

  1. and now, since reserve() should not shrink, meaning it should not do any deallocation, then the function shrink_to_fit() now never deallocates, that is how the memory leak occurs?

Yes, that's correct as of p0966r1 from C++20 being implemented. Existing code has the bad pattern though, and so since we no longer do deallocation we get the leak.

daltenty commented 10 months ago

I've posted an attempt at a fix here: https://github.com/llvm/llvm-project/pull/74168

It retains the behaviour for the restricted case of the cleared long string which reserve(0), and versions the function to the new abi going forward.

ldionne commented 10 months ago

So basically, it looks like we broke the ABI in 0407ab4114dbdbd1845df712639bdbc84ec6df2c by changing the fact that reserve(0) in the dylib would free memory, which wasn't relied upon anymore by the library at the time of the commit but had been relied upon in previous versions of the headers. Nasty.

Instead, we should have retained the pre-C++20 version in the dylib and made a new C++20 version with a different mangling. I think the ship for doing that has sailed, though. If we change the behavior of reserve(0) back to deallocating, we'll fix the memory leak with programs compiled with LLVM <= 6.0, but we will also break the ABI for all programs compiled with LLVM >= 14.0, since those may rely on the fact that the memory is not being deallocated. I don't know whether there are such programs out there and TBH it's really hard to tell, however the consequences of violating that assumption are arguably worse than a memory leak, since that would lead to a use-after-free bug, which is a security issue.

I don't think I see a way of fixing this bug without making things potentially worse, does anyone have thoughts?

daltenty commented 10 months ago

Thanks very much for the analysis and example @ldionne. I see what you mean, since C++20 users have been able to rely on the new behaviour so, for example, they may expect iterator invalidation does not occur when calling reserve(0) as you say.

However, I think the code pattern in the LLVM 6.0 header might save us in this case. The string is always made empty() before the call to reserve(0), so we actually don't need to revert the behaviour in all cases, only the special case where the string is empty (which is what I'm proposing in https://github.com/llvm/llvm-project/pull/74168 currently)

Off-hand, I'm not seeing how someone could be holding a valid reference to the buffer in a size() == 0 string (whatever we did to make it empty should have been iterator invalidating IIUC), though I definitely could be wrong.

Please let me know what you think.

hubert-reinterpretcast commented 10 months ago

I don't know whether there are such programs out there and TBH it's really hard to tell, however the consequences of violating that assumption are arguably worse than a memory leak, since that would lead to a use-after-free bug, which is a security issue.

That's only in terms of the implementation behaviour. Clearly, reserve is allowed to invalidate iterators, etc. due to reallocation to increase the capacity. Unlike vector::reserve, there is no wording to say that iterators, etc. remain valid in the absence of reallocation, and https://wg21.link/string.require#4 provides permission for the invalidation to occur. It appears C++23 programs should be written only to rely on non-reallocation in terms of performance (and not functional correctness).

It's a difficult argument to make that data() should point somewhere else in the absence of reallocation though.

hubert-reinterpretcast commented 10 months ago

Off-hand, I'm not seeing how someone could be holding a valid reference to the buffer in a size() == 0 string (whatever we did to make it empty should have been iterator invalidating IIUC), though I definitely could be wrong.

The issue is that the iterator may have been latched between the operation to make the string empty and the call to reserve.

#include <string>

extern "C" int printf(const char *, ...);

std::string longstring(
"12345678901234567890123456789012"
"12345678901234567890123456789012"
);

int main(void) {
  longstring.clear();
  const auto lsend = longstring.end();
  longstring.reserve(0);
  const auto lsbegin2 = longstring.begin();
  const auto lsend2 = longstring.end();
  printf("%ld\n", std::distance(lsbegin2, lsend2));
  printf("%ld\n", std::distance(lsbegin2, lsend));  // should be 0 _if_ invalidation is not allowed for non-reallocating `reserve`
}
daltenty commented 9 months ago

Given the discussion and potential risks of the fix identified in this item, I've mark this as Won't Fix