redis / hiredis

Minimalistic C client for Redis >= 1.2
BSD 3-Clause "New" or "Revised" License

Application (redis client) crashes with SIGSEGV in redisBufferRead at hiredis.c:941 while making a GET call to redis server #1267

Closed Sukesh0513 closed 3 months ago

Sukesh0513 commented 3 months ago

In our application we use C++, so we have used redis++ (1.3.7) and hiredis (1.0.2) for our Redis client implementation. The application makes a GET call to the Redis server; after that call, a chain of functions in redis++ and hiredis is invoked one by one, and the application finally crashes with SIGSEGV in redisBufferRead at hiredis.c:941. This is what I could see from backtracing the core file.

Note: our application makes several GET calls until it gets the desired value from the Redis server. If the value is not yet configured in the Redis server, the GET returns a null value and we retry the GET call until we get a non-null value. In the initial stage the Redis server is not populated with this value, so our application keeps making GET calls. This issue does not happen on every GET call; it only happens once, after many calls.

Here is the backtrace from the core file:

```
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007f6dc0b49a21 in redisBufferRead (c=0x1720680) at hiredis.c:941
941     hiredis.c: No such file or directory.
#0  0x00007f6dc0b49a21 in redisBufferRead (c=0x1720680) at hiredis.c:941
#1  0x00007f6dc0b49d78 in redisGetReply (c=c@entry=0x1720680, reply=reply@entry=0x7f6da0989618) at hiredis.c:1032
#2  0x00007f6dc0b84b66 in redis::Connection::recv (this=this@entry=0x7f6da0989688, handle_error_reply=handle_error_reply@entry=true)
    at /tmp/tmp.PiajinzKq4/redis-plus-plus-1.3.7/src/redis++/connection.cpp:212
#3  0x00007f6dc0b9e913 in redis::Redis::_command<void (*)(redis::Connection&, std::basic_string_view<char, std::char_traits<char> > const&), std::basic_string_view<char, std::char_traits<char> > const&> (this=<optimized out>, cmd=0x7f6dc0b9cb40 <redis::cmd::get(redis::Connection&, std::basic_string_view<char, std::char_traits<char> > const&)>, connection=...)
    at /tmp/tmp.PiajinzKq4/redis-plus-plus-1.3.7/src/redis++/redis.hpp:1316
#4  redis::Redis::command<void (*)(redis::Connection&, std::basic_string_view<char, std::char_traits<char> > const&), std::basic_string_view<char, std::char_traits<char> > const&> (
    this=<optimized out>, cmd=cmd@entry=0x7f6dc0b9cb40 <redis::cmd::get(redis::Connection&, std::basic_string_view<char, std::char_traits<char> > const&)>)
    at /tmp/tmp.PiajinzKq4/redis-plus-plus-1.3.7/src/redis++/redis.hpp:47
#5  0x00007f6dc0b9733c in redis::Redis::get[abi:cxx11](std::basic_string_view<char, std::char_traits<char> > const&) (this=<optimized out>, key=...)
    at /tmp/tmp.PiajinzKq4/redis-plus-plus-1.3.7/src/redis++/redis.cpp:314
```

Initially I thought that the redisContext used in redisBufferRead at hiredis.c:941 was a null pointer or might not be initialized, but when I checked the backtrace and tried to print it, I found that it is not null and is initialized. Looking at the backtrace, I cannot find any reason for redisBufferRead to throw a SIGSEGV. Any idea on why this could happen would be very helpful.

michael-grunder commented 3 months ago

What I would suggest is running your application under valgrind or building it with gcc or clang sanitizers.

Immediately the address 0x1720680 looks wrong compared to the other addresses in the callstack. This could indicate that you have a buffer overrun or similar issue that's overwriting the pointer to your redisContext object.

I can't tell for sure, but you can see that every other address looks more like 0x7f6da0989618, which seems like a valid address.

Valgrind can probably tell you very quickly whether the issue is in your program, in hiredis, or in redis++.

michael-grunder commented 3 months ago

Going to close this because it's unlikely a bug in hiredis.

However, if you run your program under valgrind and it does indicate hiredis as the source of the bug feel free to post the diagnostics and reopen.

Sukesh0513 commented 3 months ago

Hi,

Thanks for the analysis. We did run some unit tests under valgrind, but we did not find this issue there; in our unit tests we usually add a mock value as the response to a GET call, so those do not hit the problem. The issue happens on the test server when we keep the application idle and do not configure anything in the Redis server, so our application makes GET calls indefinitely. hiredis does not crash right away; it makes a bunch of GET calls and then, after some time, it crashes.

The crash apparently happens at hiredis.c:941, when it is trying to access the context c:

*(screenshot of the crashing line attached in the original issue)*

So I checked the core file, and c has valid content in it and is not null. The only conclusion I have is that maybe c is freed just before it is accessed at hiredis.c:941, but the address still holds the old data. Also, have you seen any issue of this sort, where hiredis frees the context early or something similar and throws a SIGSEGV after multiple retries?

michael-grunder commented 3 months ago

> Also, have you seen any issue of this sort, where hiredis frees the context early or something similar and throws a SIGSEGV after multiple retries?

Over the years we have certainly fixed segfaulting bugs, but not in quite some time. It's certainly possible that there's another one in the codebase that we haven't detected yet.

> we did run some unit tests under valgrind but we did not find this issue

The valgrind log would be interesting to see. It should be able to show the origin of the problem.

Like I said though, the address 0x1720680 for the redisContext looks suspect to me :smile: If the context is just a normal heap allocation, I would expect it to look like the rest of the addresses in your callstack:

```
0x7f6da0989618
0x7f6da0989688
0x7f6dc0b9cb40
0x00007f6dc0b49a21
0x00007f6dc0b49d78
0x00007f6dc0b84b66
0x00007f6dc0b9733c
0x00007f6dc0b9e913
```

Those look like heap allocations to me. 0x1720680 looks like it's in your program somewhere.

In addition to the output of valgrind another thing to try is running your tests using rr. If you can catch the crash it will allow you to replay it over and over again, meaning if this is a buffer overrun or similar you will be able to set a watchpoint on your context and run backwards until it detects the write.

```
(rr) watch -l <redisContext pointer>
(rr) reverse-continue
```
Sukesh0513 commented 1 month ago

Hi again,

Sorry for the delay. We have reproduced the issue under valgrind, and here is what I can see from the valgrind backtrace. It is not exactly the same as the previous crash at redisBufferRead (c=0x1720680) at hiredis.c:941, but we do see a problem here as well, at a different place. Please take a look and let me know if this rings any bells:

```
==4595== 1,532 bytes in 1 blocks are possibly lost in loss record 1,258 of 1,290
==4595==    at 0x483D85F: realloc (vg_replace_malloc.c:1451)
==4595==    by 0x4928B49: hi_realloc (alloc.h:71)
==4595==    by 0x4928B49: sdsMakeRoomFor (sds.c:226)
==4595==    by 0x49293AB: sdscatlen (sds.c:384)
==4595==    by 0x492EA44: redisReaderFeed (read.c:724)
==4595==    by 0x4927ECD: redisBufferRead (hiredis.c:985)
==4595==    by 0x4928227: redisGetReply (hiredis.c:1081)
==4595==    by 0x48B75B5: redis::Connection::recv(bool) (connection.cpp:241)
==4595==    by 0x48D3702: _command<void (*)(redis::Connection&, const std::basic_string_view&), const std::basic_string_view<char, std::char_traits<char> >&> (redis.hpp:1316)
==4595==    by 0x48D3702: std::enable_if<!std::is_convertible<void (*)(redis::Connection&, std::basic_string_view<char, std::char_traits<char> > const&), std::basic_string_view<char, std::char_traits<char> > >::value, std::unique_ptr<redisReply, redis::ReplyDeleter> >::type redis::Redis::command<void (*)(redis::Connection&, std::basic_string_view<char, std::char_traits<char> > const&), std::basic_string_view<char, std::char_traits<char> > const&>(void (*)(redis::Connection&, std::basic_string_view<char, std::char_traits<char> > const&), std::basic_string_view<char, std::char_traits<char> > const&) (redis.hpp:47)
==4595==    by 0x48CBF7B: redis::Redis::get[abi:cxx11](std::basic_string_view<char, std::char_traits<char> > const&) (redis.cpp:314)
```

michael-grunder commented 1 month ago

This looks like partial output from valgrind where it's describing a memory leak.

You can capture the full valgrind output with `--log-file=/tmp/some-log-file.txt`. You can also specify `--track-origins=yes`, which will show not only the invalid read/write but also where the memory came from, if valgrind can determine that.