sewenew / redis-plus-plus

Redis client written in C++
Apache License 2.0
1.52k stars 348 forks source link

[BUG] coredump happened in case connect_callback with exception for async mode #573

Open lisay-yan opened 1 week ago

lisay-yan commented 1 week ago

Describe the bug

in AsyncConnection::connect_callback, there throw error during handle different state, my case is throw error during State::CONNECTING, then program fire SegV.

To Reproduce To simple reproduce, you can use TLS, provide CA file which didn't exist, then , core dump will happen

Expected behavior Program shouldn't crash.

Environment:

Additional context I think it is a multiple thread issues. AsyncConnection::connect_callback, handle exception via AsyncConnection::disconnect, then redisAsyncContext is freed. While, it seems libuv still not aware of context invalid, and still try to access, then segV.

bt

0 0x00007f1a8a28799e in redisAsyncHandleWrite (ac=0x7f1a8000dee0) at async.c:687

1 0x00007f1a884f6f2b in uv__io_poll (loop=loop@entry=0x153e100, timeout=)

at src/unix/linux.c:1526

2 0x00007f1a884e5a4b in uv_run (loop=0x153e100, mode=UV_RUN_DEFAULT) at src/unix/core.c:447

3 0x00007f1a89e27330 in std::(anonymous namespace)::execute_native_thread_routine (__p=)

at ../../../../../libstdc++-v3/src/c++11/thread.cc:84

4 0x00007f1a8b24bea5 in start_thread (arg=0x7f1a86a1b700) at pthread_create.c:307

5 0x00007f1a8958ab0d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

(gdb) f 0

0 0x00007f1a8a28799e in redisAsyncHandleWrite (ac=0x7f1a8000dee0) at async.c:687

687 c->funcs->async_write(ac);

Below shown details, c->funcs had been invalid.

p (redisAsyncContext )ac $3 = {c = {funcs = 0xffffffffffffffff, err = -1,...

All in all, redis-plus-plus seems is not thread safe when interact with libuv at rainy cases.

sewenew commented 3 days ago

Can you try the latest redis-plus-plus? I cannot reproduce your problem with the latest code on master branch.

The following is my test code:

        try {
                ConnectionOptions connection_options;
                connection_options.host = "127.0.0.1";
                connection_options.port = 6379;
                connection_options.tls.enabled = true;
                connection_options.tls.cacert = "./tests/tls/cacert-does-not-exist.crt"; 
                connection_options.tls.cert = "./tests/tls/redis.crt";
                connection_options.tls.key = "./tests/tls/redis.key";
                auto redis = AsyncRedis(connection_options);
                cout << redis.ping().get() << endl;
        } catch (const Error &err) {
                std::cout << "catch error=" << err.what() << std::endl;
        }

I specified a cacert file that does not exist: cacert-does-not-exist.crt, the code throws exception but does not trigger a segment fault. The following is the output message:

catch error=failed to create TLS context: Failed to load CA Certificate or CA Path

Sorry for the late reply, too busy these days...

Regards

lisay-yan commented 3 days ago

ok, I miss a key point, I send a "cluster info" after create client to understand server state. Would you like retry with any redis traffic after client created?

It is a multiple thread issue...

sewenew commented 3 days ago

Sorry, but I still cannot reproduce your problem in Muti-thread env (Check the following code). Can you give me a minimum code that reproduces your problem?

        try {
                ConnectionOptions connection_options;
                connection_options.host = "127.0.0.1";
                connection_options.port = 30001;
                connection_options.tls.enabled = true;
                connection_options.tls.cacert = "./tests/tls/missing-cacert.crt";
                connection_options.tls.cert = "./tests/tls/redis.crt";
                connection_options.tls.key = "./tests/tls/redis.key";
                auto redis = AsyncRedisCluster(connection_options);
                vector<thread> workers;
                for (auto i = 0; i < 30; ++i) {
                    workers.emplace_back([&redis]() {
                            for (auto j = 0; j < 10; ++j) {
                                try {
                                    //redis.ping().get();
                                    redis.redis(to_string(j)).ping().get();
                                } catch (const Error &e) {
                                    std::cout << "error=" << e.what() << std::endl;
                                }
                            }
                        });
                }
                for (auto &w : workers) {
                    w.join();
                }
        } catch (const Error &err) {
                std::cout << "catch error=" << err.what() << std::endl;

Regards

lisay-yan commented 2 days ago

@sewenew

I use your code, coredump still is happened in my test env. And share my build options, it is via c++11. g++ -g -o redisTest main.cxx -lhiredis -lredis++ -lhiredis_ssl -lssl -lcrypto -pthread -std=c++11

   ConnectionOptions connection_options;
    connection_options.host = "127.0.0.1";
    connection_options.port = 8011;
    connection_options.tls.enabled = true;
    connection_options.tls.cacert = "/tmp/tls/ca2.crt"; //not exist in path
    connection_options.tls.cert = "/tmp/tls/client.crt";
    connection_options.tls.key = "/tmp/tls/client.key";
    AsyncRedisCluster *client = new AsyncRedisCluster(connection_options);
    vector<thread> workers;
    for (auto i = 0; i < 30; ++i)
    {
            workers.emplace_back([&client]()
                                 {
                        for (auto j = 0; j < 10; ++j) {
                            try {
                                //redis.ping().get();
                                client->redis(to_string(j)).ping().get();
                            } catch (const Error &e) {
                                std::cout << "error=" << e.what() << std::endl;
                            }
                        } });
    }
    for (auto &w : workers)
    {
            w.join();
    }

    return;

FBT

gdb -c ./core.2313 ./redisTest

(gdb) bt

0 0x00007f3d3c006240 in ?? ()

1 0x00007f3d6ba1c9a1 in redisAsyncHandleWrite (ac=0x7f3d3c001020) at async.c:687

2 0x00007f3d6a066f2b in uv__io_poll (loop=loop@entry=0x6fbab0, timeout=) at src/unix/linux.c:1526

3 0x00007f3d6a055a4b in uv_run (loop=0x6fbab0, mode=UV_RUN_DEFAULT) at src/unix/core.c:447

4 0x00007f3d6ae2c330 in std::(anonymous namespace)::execute_native_thread_routine (__p=)

at ../../../../../libstdc++-v3/src/c++11/thread.cc:84

5 0x00007f3d6a64aea5 in start_thread (arg=0x7f3d68806700) at pthread_create.c:307

6 0x00007f3d6a373b0d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

sewenew commented 2 days ago

Which version of redis-plus-plus do you use? Can you test it with the latest code on master branch?

Regards

lisay-yan commented 2 days ago

I tried use latest master branch, coredump kept.