apache / pulsar-client-python

Apache Pulsar Python client library
https://pulsar.apache.org/
Apache License 2.0

Reader.has_message_available() returns True when no more messages are available on a compacted topic with tombstone messages #221

Open telnoratti opened 2 weeks ago

telnoratti commented 2 weeks ago

When reading a compacted topic with is_read_compacted=True, Reader.has_message_available() keeps returning True after all messages have been read if the topic contains a tombstone message. This might be related to #199. As soon as any message is published after compaction, it behaves correctly again.

Environment: Python 3.12.4, pulsar-client 3.5.0

import pulsar

# placeholder; a real service URL looks like "pulsar://localhost:6650"
pulsar_url = "localhost"
topic = "tenant/namespace/topic"
client = pulsar.Client(pulsar_url)

producer = client.create_producer(topic)
producer.send(
    b'message',
    partition_key="1",
)
# tombstone message
producer.send(
    b'',
    partition_key="1",
)
# If the topic holds nothing but the tombstoned key, has_message_available()
# returns False, presumably because the compacted topic is empty.
# This second key keeps the compacted topic non-empty.
producer.send(
    b'message2',
    partition_key="2",
)

# compact the topic here with "bin/pulsar-admin topics compact tenant/namespace/topic"

reader = client.create_reader(
    topic=topic,
    start_message_id=pulsar.MessageId.earliest,
    is_read_compacted=True,
)

while reader.has_message_available():
    msg = reader.read_next()
    print(msg.data())
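
For reference, the compacted view this reproduction should yield can be modeled in plain Python. This is only a sketch of the documented compaction semantics (latest message per key, an empty payload deletes the key), not the broker's implementation. `compact` is a hypothetical helper and does not touch a real topic.

```python
def compact(messages):
    """Model topic compaction: keep the latest payload per key, drop tombstoned keys.

    `messages` is an iterable of (partition_key, payload) tuples in publish order.
    """
    latest = {}
    for key, payload in messages:
        if payload == b"":
            # An empty payload is a tombstone: it removes the key.
            latest.pop(key, None)
        else:
            latest[key] = payload
    return latest


# Same three messages as in the reproduction above.
published = [("1", b"message"), ("1", b""), ("2", b"message2")]
compacted = compact(published)
print(compacted)  # {'2': b'message2'}
```

Under this model the reader should return exactly one message (b'message2'), after which has_message_available() should be False.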

Once another message has been sent, the reader operates correctly again.

producer.send(
    b'message3',
    partition_key="3",
)
while reader.has_message_available():
    msg = reader.read_next()
    print(msg.data())
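
Until this is resolved, one possible workaround is to bound each read with a timeout, so the loop cannot block if has_message_available() reports a message that never arrives. This is a minimal sketch, not a fix. It assumes `read_next` accepts `timeout_millis` and raises `pulsar.Timeout` when it expires, which is worth checking against the installed client version. `drain` is a hypothetical helper; it takes the exception type as a parameter so it does not depend on the client import.

```python
def drain(reader, timeout_exc, timeout_millis=1000):
    """Read until the reader reports no more messages or a read times out."""
    payloads = []
    while reader.has_message_available():
        try:
            msg = reader.read_next(timeout_millis=timeout_millis)
        except timeout_exc:
            # has_message_available() said True, but nothing arrived in time.
            break
        payloads.append(msg.data())
    return payloads


# Assumed usage with the reader from the reproduction:
#   for data in drain(reader, pulsar.Timeout):
#       print(data)
```

The timeout value is a trade-off: too short and a slow broker response is mistaken for the end of the topic, too long and the loop stalls for that long once the topic is exhausted.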