confluentinc / librdkafka

The Apache Kafka C/C++ library

Reading committed offsets where metadata contains null byte leads to reading random data after null byte. #4649

Open mlowicki opened 6 months ago

mlowicki commented 6 months ago

Description

I reproduced this with Rust code using rust-rdkafka, but that wrapper does little when reading committed offsets besides calling librdkafka itself. Because rust-rdkafka checks that the metadata is a valid UTF-8 string, it panics with errors like:

Metadata is not UTF-8: Utf8Error { valid_up_to: 3, error_len: Some(1) }

once librdkafka starts returning "random" data.

I also verified, by implementing OffsetFetch and OffsetCommit in pure Rust, that this is not an issue on the Kafka side: with that implementation I couldn't reproduce reading invalid data.

How to reproduce

Use the byte array [10, 20, 0, 30, 40] as the commit metadata and commit it for any partition. Then read the committed offsets via rd_kafka_committed; in some cases the metadata after \0 differs from what was written.

Examples from other tests I've conducted, where the same written metadata comes back with random contents:

  4 |   0 |  66 |  32 |  64 |  32 |   2 |  16 |  82 | 108 |  25 |  74 | 120 |  24 |  52 |  20 |  58 |  28 |  76 |  22 |  51 |  25 |  82 |  99 |  47 |  91 |  12 |  22 | 115 |  20 | 116 | 100 |  50 |  89 |  76 |  23 |  43 |  49 | 104 |  34 |   0 |   0 | 

  4 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |  90 |  84 |  85 |  77 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 | 160 |  32 |   0 |   0 |   0 |   0 |   0 |   0 |  90 |  84 |  85 |  77 |   0 |   0 | 
  4 |   0 |  66 |  32 |  64 |  32 |   2 |  16 |  93 | 108 |  25 |  74 | 120 |  24 |  52 |  20 |  58 |  28 |  76 |  22 |  51 |  25 |  82 |  99 |  47 |  91 |  12 |  22 | 115 |  20 | 116 | 100 |  50 |  89 |  76 |  23 |  43 |  49 | 104 |  34 |   0 |   0 | 

  4 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 | 120 | 200 | 240 |  78 |   1 |   0 |   0 |   0 |  15 |   0 |   0 |  64 |   0 |   0 |   0 |   0 |   0 |   0 |
  4 |   0 |  66 |  32 |  64 |  32 |   2 |  16 | 106 |  44 |  25 |  74 | 120 |  24 |  52 |  20 |  58 |  28 |  76 |  22 |  51 |  25 |  82 |  99 |  47 |  91 |  12 |  22 | 115 |  20 | 116 | 100 |  50 |  89 |  76 |  23 |  43 |  49 | 104 |  34 |   0 |   0 | 

  4 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 |   0 | 160 |  62 |  84 | 232 |   1 |   0 |   0 |   0 | 239 | 179 | 223 | 191 | 254 | 255 | 255 | 255 |  17 |  76 |  32 |  64 |   1 |   0 |   0 |   0 |   3 |  25 |

rust-rdkafka used librdkafka 2.3.0 - https://github.com/fede1024/rust-rdkafka/commit/87105bcb44e37bb35bf3eabdf92ef73b9a3d2c18.

Checklist

IMPORTANT: We will close issues where the checklist has not been completed.

Please provide the following information:

this is all I set:

config: ClientConfig {
    conf_map: {
        "bootstrap.servers": "XXX",
        "group.id": "bar",
    },
    log_level: Error,
}

Nothing is logged and everything seems to be working just fine.

Can't do it, but there are no errors or warnings on the broker side. Also, as said above, I've confirmed it isn't an issue on the Kafka side.

emasab commented 4 months ago

It happens because, if _GNU_SOURCE is defined, it's using strndup here: https://github.com/confluentinc/librdkafka/blob/2587cac70f83fced42c51f921bed325a434f5bc7/src/rdkafka_request.c#L1236

strndup stops at the first NUL byte, unlike the alternative implementation librdkafka provides. This needs to be fixed.