confluentinc / librdkafka

The Apache Kafka C/C++ library

Crash (null-pointer access) in rd_kafka_metadata_cache_entry_by_id_cmp() during rd_avl_insert() #4778

Open GerKr opened 1 month ago

GerKr commented 1 month ago

Description

After about 2 months of usage, a crash occurred. The crash dump shows that it happened in the function rd_kafka_metadata_cache_entry_by_id_cmp(const void *_a, const void *_b), where _b is 0x0000000000000000. This leads to a write access to address 0x0000000000000088.

During the call of rd_avl_insert(ravl, elm, ran), the variable ravl contains:

ravl->ravl_root == 0x12eaf651500 (not a null pointer)
ravl->ravl_root->ran_height == 0x61657268 (read as text this is "hrea", part of the "hread_0" found in the memory dump below)
ravl->ravl_root->ran_elm == 0x0000000000000000 (null pointer!)
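A fault address of 0x88 with _b == NULL would be consistent with the comparator touching a member that sits 0x88 bytes into the structure _b is cast to. The following is only a sketch of that arithmetic: struct entry_view and its member names are hypothetical stand-ins, not the real librdkafka cache entry layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the structure the comparator casts _b to.
 * The only assumption is that the accessed member lives at offset 0x88. */
struct entry_view {
    unsigned char leading[0x88];       /* everything before the member */
    unsigned char compared_member[16]; /* the member the comparator touches */
};

/* Address that is accessed when the member is reached through `base`. */
static uintptr_t member_address(uintptr_t base) {
    return base + offsetof(struct entry_view, compared_member);
}
```

With base == 0 (a NULL _b) the accessed address equals the member offset, which matches the small non-zero address reported in the crash dump.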

The memory that ravl->ravl_root points to looks as follows:

0f 00 00 00 00 00 00 00 00 40 89 b8 2e 01 00 00
68 72 65 61 64 5f 30 00 00 00 00 00 00 00 00 00
0f 00 00 00 00 00 00 00 00 74 e2 b6 2e 01 00 00
06 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
0f 00 00 00 00 00 00 00 00 65 74 43 68 61 6e
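To make the dump easier to cross-check against the values listed in the description, here is a small decoding sketch. It assumes a 64-bit little-endian layout in which a node consists of two child pointers, a 32-bit height padded to 8 bytes, and an element pointer. The struct node_view and the helper names are illustrative and not taken from the librdkafka sources.

```c
#include <stddef.h>
#include <stdint.h>

/* First 32 bytes of the dump at ravl->ravl_root. */
static const unsigned char root_dump[32] = {
    0x0f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x40, 0x89, 0xb8, 0x2e, 0x01, 0x00, 0x00,
    0x68, 0x72, 0x65, 0x61, 0x64, 0x5f, 0x30, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};

/* Assumed node layout (hypothetical mirror of the AVL node). */
struct node_view {
    uint64_t child[2]; /* offsets 0 and 8: left/right child pointers */
    uint32_t height;   /* offset 16: sub-tree height (4 bytes + padding) */
    uint64_t elm;      /* offset 24: pointer to the stored element */
};

/* Read an n-byte little-endian unsigned value. */
static uint64_t read_le(const unsigned char *p, size_t n) {
    uint64_t v = 0;
    for (size_t i = n; i > 0; i--)
        v = (v << 8) | p[i - 1];
    return v;
}

/* Decode one node from raw bytes under the assumed layout. */
static struct node_view decode_node(const unsigned char *p) {
    struct node_view v;
    v.child[0] = read_le(p, 8);
    v.child[1] = read_le(p + 8, 8);
    v.height   = (uint32_t)read_le(p + 16, 4);
    v.elm      = read_le(p + 24, 8);
    return v;
}
```

Under these assumptions, decode_node(root_dump) yields height 0x61657268 and elm 0, matching the reported values. Bytes 16 to 22 are the ASCII characters "hread_0", which suggests the node was overwritten with string data.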

The call stack looks as follows:

librdkafka.dll!rd_kafka_metadata_cache_entry_by_id_cmp(const void _a, const void _b) Line 700 C
librdkafka.dll!rd_avl_insert_node(rd_avl_s ravl, rd_avl_node_s parent, rd_avl_node_s ran, rd_avl_node_s existing) Line 104 C
librdkafka.dll!rd_kafka_metadata_cache_insert(rd_kafka_s rk, const rd_kafka_metadata_topic mtopic, const rd_kafka_metadata_topic_internal_s metadata_internal_topic, int64 now, int64 ts_expires, unsigned char include_racks, rd_kafka_metadata_broker_internal_s brokers_internal, unsigned __int64 broker_cnt) Line 380 C
librdkafka.dll!rd_kafka_metadata_cache_topic_update(rd_kafka_s rk, const rd_kafka_metadata_topic mdt, const rd_kafka_metadata_topic_internal_s mdit, unsigned char propagate, unsigned char include_racks, rd_kafka_metadata_broker_internal_s brokers, unsigned __int64 broker_cnt, unsigned char only_existing) Line 508 C
librdkafka.dll!rd_kafka_parse_Metadata0(rd_kafka_broker_s rkb, rd_kafka_buf_s request, rd_kafka_buf_s rkbuf, rd_kafka_metadata_internal_s mdip, rd_list_s request_topics, const char reason) Line 857 C
librdkafka.dll!rd_kafka_parse_Metadata(rd_kafka_broker_s rkb, rd_kafka_buf_s request, rd_kafka_buf_s rkbuf, rd_kafka_metadata_internal_s mdip) Line 1113 C
librdkafka.dll!rd_kafka_handle_Metadata(rd_kafka_s rk, rd_kafka_broker_s rkb, rd_kafka_resp_err_t err, rd_kafka_buf_s rkbuf, rd_kafka_buf_s request, void opaque) Line 2490 C
librdkafka.dll!rd_kafka_buf_callback(rd_kafka_s rk, rd_kafka_broker_s rkb, rd_kafka_resp_err_t err, rd_kafka_buf_s response, rd_kafka_buf_s request) Line 512 C
librdkafka.dll!rd_kafka_buf_handle_op(rd_kafka_op_s rko, rd_kafka_resp_err_t err) Line 453 C
librdkafka.dll!rd_kafka_op_handle_std(rd_kafka_s rk, rd_kafka_q_s rkq, rd_kafka_op_s rko, int cb_type) Line 884 C
librdkafka.dll!rd_kafka_op_handle(rd_kafka_s rk, rd_kafka_q_s rkq, rd_kafka_op_s rko, rd_kafka_q_cb_type_t cb_type, void opaque, rd_kafka_op_res_t()(rd_kafka_s , rd_kafka_q_s , rd_kafka_op_s , rd_kafka_q_cb_type_t, void ) callback) Line 916 C
librdkafka.dll!rd_kafka_q_serve(rd_kafka_q_s rkq, int timeout_ms, int max_cnt, rd_kafka_q_cb_type_t cb_type, rd_kafka_op_res_t()(rd_kafka_s , rd_kafka_q_s , rd_kafka_op_s , rd_kafka_q_cb_type_t, void ) callback, void opaque) Line 581 C
librdkafka.dll!rd_kafka_thread_main(void arg) Line 2138 C
librdkafka.dll!_thrd_wrapper_function(void aArg) Line 589 C
kernel32.dll!00007ffd36527e94() Unknown
ntdll.dll!RtlUserThreadStart() Unknown
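The two topmost frames suggest that the tree insert handed the element pointer of an already stored node to the comparator as _b. The sketch below illustrates that pattern with a simplified binary search tree insert without rebalancing. It is not the librdkafka implementation; struct node, insert_node and the recording comparator are hypothetical names used only for illustration.

```c
#include <stddef.h>

typedef int (*cmp_fn)(const void *a, const void *b);

/* Simplified tree node: children plus a back-pointer to the element. */
struct node {
    struct node *child[2]; /* [0] = left, [1] = right */
    void *elm;
};

/* Insert n below parent and return the (unchanged) subtree root.
 * The existing node's element is always the comparator's second argument,
 * so a corrupted node with elm == NULL reaches the comparator as b == NULL. */
static struct node *insert_node(struct node *parent, struct node *n,
                                cmp_fn cmp) {
    if (parent == NULL)
        return n;
    int r = cmp(n->elm, parent->elm);
    int dir = r < 0 ? 0 : 1;
    parent->child[dir] = insert_node(parent->child[dir], n, cmp);
    return parent;
}

/* Comparator that records its second argument instead of crashing. */
static const void *last_b;

static int cmp_int_recording(const void *a, const void *b) {
    last_b = b;
    if (b == NULL) /* a real comparator would dereference b here */
        return 1;
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}
```

Inserting into a tree whose root has elm == NULL makes the recording comparator observe b == NULL, which is the situation the crash dump shows for rd_kafka_metadata_cache_entry_by_id_cmp().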

How to reproduce

I don't know how to reproduce it: the crash did not occur at any other time during the 2 months of operation, so it appears to be a sporadic problem. The crash dump, the corresponding PDB file, and the sources (v2.4.0) are available. On request I can provide the values of variables or further memory dumps.

GerKr commented 1 month ago

For decoding the memory dump: I use a 64-bit release build of librdkafka.