dlt-hub / verified-sources

Contribute to dlt verified sources 🔥
https://dlthub.com/docs/walkthroughs/add-a-verified-source
Apache License 2.0
50 stars 40 forks source link

kafka: don't subscribe after consumer has been assigned #451

Closed aksestok closed 1 month ago

aksestok commented 1 month ago

Tell us what you do here

Short description

See #450.

Don't subscribe after assigning partitions and offsets.

Related Issues

Additional Context

AstrakhantsevaAA commented 1 month ago

@IlyaFaer take a look please

IlyaFaer commented 1 month ago

@aksestok, @rudolfix, after all, you may be right. For all cases, Kafka returns offset -1001. That's a nasty constant, which is supposed to be an invalid offset.

Though it's marked invalid, Kafka actually accepts our offsets. But! In case subscribe() called, Kafka reverts offsets we just assigned. And even if I explicitly commit offsets after assign(), subscribe() still reverts to the previous offset (though -1001 is supposed to use the last committed offset, and we just committed it).

Seems to me the problem is in these -1001 (which comes from backend - in code we have normal digits). All the tests are passing without subscribe() (but they do not pass anymore with subscribe() 🫤, I have two failures), so I think we can merge it.

Untitled

IlyaFaer commented 1 month ago

The craziest is that before and after subscribe(), Kafka returns the same assignments and positions. Though under the hood they are not.

image

rudolfix commented 1 month ago

All the tests are passing without subscribe() (but they do not pass anymore with subscribe() 🫤, I have two failures)

@IlyaFaer do we miss an essential test? extract the messages, drop the pipeline (to simulate state rollback) and run it again making sure the same messages are back?

IlyaFaer commented 1 month ago

@rudolfix, done:

https://github.com/dlt-hub/verified-sources/pull/467