rui314 / mold

Mold: A Modern Linker 🦠

Reference counting of std::shared_ptr is non-atomic when using the mold linker #1286

Closed by dutor 2 weeks ago

dutor commented 2 weeks ago

First of all, thanks for this great linker. It's blazing fast and saves me a lot of build time.

My environment:

Simplified code to reproduce this issue:

// file main.cpp
#include <stdio.h>
#include <memory>

using SP = std::shared_ptr<int>;
// run() will launch multiple threads to incr/decr the refcnt of the given shared_ptr.
// upon return from run() all copies of the shared_ptr will be released.
void run(SP sp, SP(*)(const SP&));
SP copy(const SP &sp) {
    return sp;
}
int main() {
    auto sp = std::make_shared<int>(0);
    run(sp, copy);
    // Here we expect use_count() to be 1. use_count() returns long, hence %ld.
    fprintf(stderr, "use_count: %ld\n", sp.use_count());
    return 0;
}

// file run.cpp
#include <stdio.h>
#include <memory>
#include <thread>
#include <vector>

using SP = std::shared_ptr<int>;

void run(SP sp, SP (*copy)(const SP&)) {
    static constexpr auto N = 4UL;
    std::vector<std::thread> threads;
    threads.reserve(N);

    auto func = [=] (SP ptr) {
        static constexpr auto kLoops = 5000000UL;
        std::vector<SP> sps;
        sps.reserve(kLoops);
        for (auto i = 0UL; i < kLoops; i++) {
            sps.push_back(copy(ptr));
        }
    };

    for (auto i = 0UL; i < N; i++) {
        threads.emplace_back(func, sp);
    }

    for (auto i = 0UL; i < N; i++) {
        threads[i].join();
    }
}

Build & run

$ export MOLD_PATH=./usr/libexec/mold
$ g++ -pthread -B$MOLD_PATH -fPIC -shared run.cpp -o librun.so
$ g++ -pthread -B$MOLD_PATH main.cpp -L. -lrun  -o main
$ ./main
use_count: 2199319

Some facts:

Thanks in advance.

rui314 commented 2 weeks ago

Thank you for your report. This is a very difficult issue and arguably a bug in glibc rather than the linker. At least the code is very fragile as it depends on when a weak symbol is resolved.

Even with GNU ld, if you compile your main executable with g++ -pthread main.cpp ./librun.so -o main -fno-PIC -no-pie -Wl,-allow-shlib-undefined, the output is broken just like mold's output. Or, if you use LLVM lld and compile with g++ -pthread main.cpp ./librun.so -o main -fno-PIC -no-pie, the result is the same.

Let me think more about how to fix this. By the way, how did you find this problem?
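
The weak-symbol dependency mentioned above can be pictured with a small sketch. This is not libstdc++'s or glibc's actual code. It only illustrates the general pattern, assuming a GCC-compatible compiler on an ELF target: a weak reference whose address is null unless some linked object defines it, and a counter update that picks the atomic or the plain path based on that. The names `hypothetical_thread_marker`, `threads_active`, `add_dispatch`, and `dispatch_demo` are made up for illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical marker standing in for a threading-library symbol.
// It is declared weak, so if nothing linked into the process defines it,
// its address resolves to null instead of producing a link error.
extern "C" void hypothetical_thread_marker() __attribute__((weak));

// True only if the weak reference was bound to a real definition.
inline bool threads_active() {
    return &hypothetical_thread_marker != nullptr;
}

// Add `delta` to `counter` and return the previous value.
// `threaded` selects between an atomic read-modify-write and a cheaper
// load-then-store that is only safe when a single thread is running.
inline int add_dispatch(std::atomic<int>& counter, int delta, bool threaded) {
    if (threaded) {
        return counter.fetch_add(delta, std::memory_order_acq_rel);
    }
    int old = counter.load(std::memory_order_relaxed);
    counter.store(old + delta, std::memory_order_relaxed);
    return old;
}

// Convenience overload: let the weak-symbol check make the choice.
inline int add_dispatch(std::atomic<int>& counter, int delta) {
    return add_dispatch(counter, delta, threads_active());
}

// Single-threaded usage: both paths give the same result.
inline void dispatch_demo() {
    std::atomic<int> refs{1};
    assert(add_dispatch(refs, 1, true) == 1);
    assert(add_dispatch(refs, -1, false) == 2);
    assert(refs.load() == 1);
}
```

The fragility is in `threads_active()`: its answer depends on how and when the weak reference is resolved. If it answers "no threads" in a process that does start threads, every update silently takes the plain path.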

rui314 commented 2 weeks ago

Please try again with git head.

dutor commented 1 week ago

Thanks for the reply and fix!

> By the way, how did you find this problem?

We had run into several intermittent memory issues, such as heap-use-after-free on the shared_ptr control block and leaks of resources managed by shared_ptr. So we traced every reference-counting operation on shared_ptr and found that atomicity was compromised. From there we worked our way to the runtime atomic dispatch, the pthread weak symbols, and so on.
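
To see in isolation what a compromised refcount update does, here is a minimal sketch. It is not the code path shared_ptr actually takes. The hypothetical helper `bump_concurrently` models a non-atomic increment as a separate relaxed load and store on a `std::atomic<long>`, which loses updates under contention without introducing a C++ data race. The thread and loop counts are arbitrary.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Run `threads` workers that each increment a shared counter `loops` times
// and return the final value. With atomic_rmw == false, the increment is
// split into a load and a store, so concurrent workers can overwrite each
// other's updates.
inline long bump_concurrently(unsigned threads, long loops, bool atomic_rmw) {
    assert(threads > 0 && loops >= 0);
    std::atomic<long> counter{0};
    std::vector<std::thread> workers;
    workers.reserve(threads);
    for (unsigned t = 0; t < threads; ++t) {
        workers.emplace_back([&counter, loops, atomic_rmw] {
            for (long i = 0; i < loops; ++i) {
                if (atomic_rmw) {
                    // One indivisible read-modify-write.
                    counter.fetch_add(1, std::memory_order_relaxed);
                } else {
                    // Two steps: another worker may store in between.
                    long seen = counter.load(std::memory_order_relaxed);
                    counter.store(seen + 1, std::memory_order_relaxed);
                }
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter.load();
}
```

With `atomic_rmw` set, the result is always `threads * loops`. Without it, the result can fall anywhere from `loops` up to `threads * loops`. The same kind of drift on a shared_ptr control block would leave a wrong `use_count()` after all copies are released, and could cause the leaks and use-after-free described above.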

Interestingly, we initially fixed this bug by linking every binary against libpthread explicitly with --no-as-needed (just like the fix in https://github.com/rui314/mold/commit/06b592683c150a18d7808e6a91387c0393fa849b). But that seems not to be encouraged, given the -pthread option. Then we discovered that other linkers such as ld and gold don't have this issue (with our build options).

> This is a very difficult issue and arguably a bug in glibc rather than the linker

I'm not a toolchain guy. Is there any related discussion of this problem? It might help me understand it better.

rui314 commented 1 week ago

I wrote an explanation of the issue in the commit message, so you may want to read that first if you want to understand it better. Feel free to ask any questions!

rui314 commented 1 week ago

I think this is worth making a new release. I'll be releasing mold 2.32.1 soon.