Closed richardhundt closed 1 year ago
If I change this https://github.com/faust-streaming/faust/blob/4a234204ee26bb28472a6640ccab286b807a3681/faust/transport/drivers/aiokafka.py#L715 from
aiokafka_offsets = {
    tp: OffsetAndMetadata(offset, "")
    for tp, offset in offsets.items()
    if tp in self.assignment()
}
self.tp_last_committed_at.update({tp: now for tp in aiokafka_offsets})
await consumer.commit(aiokafka_offsets)
to this:
aiokafka_offsets = {
    TopicPartition(tp.topic, tp.partition): OffsetAndMetadata(offset, "")
    for tp, offset in offsets.items()
    if tp in self.assignment()
}
self.tp_last_committed_at.update({TP(tp.topic, tp.partition): now for tp in aiokafka_offsets})
await consumer.commit(aiokafka_offsets)
it works.
Interesting, let's get a PR opened with these changes and talk about this further there. I've never seen this issue before, so I'm curious to know what code you're running that triggered this.
@richardhundt as far as I can see, a consumer has the function _new_topicpartition, which should avoid exactly this for aiokafka drivers: https://github.com/faust-streaming/faust/blob/87a80a968f73220d5ac6190fb7df70b85427bdae/faust/transport/drivers/aiokafka.py#L258
But it doesn't seem to be used anywhere 🤔
@dabdada also, when it fetches the assignment, it makes sure they're not the aiokafka ones:
I do suspect this to be simply for internal type usage.
Can you provide a minimal example of code that raises the error? Like @wbarnha, I haven't experienced anything like this and now wonder why 😁
@dabdada Are you saying that _commit is only for internal use?
Because you can just inspect it; it's not hard to see that it calls assignment, and assignment calls ensure_TPset, which gives you a set of faust's TP types, and there's definitely an isinstance assertion in aiokafka, so you simply cannot pass anything other than aiokafka's TopicPartition types.
Surely seeing that _commit is broken doesn't require anything other than looking at it.
The real question is why has this ever worked? Does _commit not get called unless we're in some strange code path, because we're usually relying on Kafka's autocommit?
EDIT: also the stack trace proves my point ;) It actually logs the dictionary which causes the error:
Current assignment: {TP(topic='test-uploads', partition=0), TP(topic='test-dgtal-cases-documents-list-changelog', partition=0), TP(topic='test-dgtal.worker.agents.health_agent', partition=0), TP(topic='test-transactions', partition=0), TP(topic='dgtal
Those are TP instances, clearly not TopicPartition instances, so you'd expect passing them to aiokafka to fail.
I've added comments inline to show what's going on:
# we're about to build a `Dict[TP, OffsetAndMetadata]`...
aiokafka_offsets = {
    tp: OffsetAndMetadata(offset, "")
    for tp, offset in offsets.items()  # offsets is `Mapping[TP, int]`
    # if `assignment` returns `TP` instances (which it does)
    # then this builds a `Dict[TP, OffsetAndMetadata]`
    if tp in self.assignment()
}
# the following is okay, we can work with `TP` instances
self.tp_last_committed_at.update({tp: now for tp in aiokafka_offsets})
# however the following calls into `aiokafka` and passes the *same* `Dict[TP, OffsetAndMetadata]`.
# It cannot possibly be correct because aiokafka wants `TopicPartition` instances!
await consumer.commit(aiokafka_offsets)  # BOOM!
EDIT: I'm now thinking that there are cases where the offsets: Mapping[TP, int] parameter is a Mapping[TopicPartition, int], but that doesn't explain why the if tp in self.assignment() filter works, because equality should fail if the tps are different types.
I think I finally figured it out after digging deeper into faust-streaming.
First of all, yes, _commit is an internal function according to Python convention (prefixed with an underscore). You should be careful when using it, although in this case its documentation is not optimal and the type hints suggest that faust's TP is used here.
Secondly, the in self.assignment() filter works simply because faust.types.tuples.TP and kafka.structs.TopicPartition are both named tuples with the same attributes, so their equality check compares them by attribute values, and those are the same.
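The equality behaviour described above can be reproduced with a minimal sketch. The two classes below are local stand-ins with the same field names, not the real faust.types.tuples.TP or kafka.structs.TopicPartition, so this only illustrates how Python named tuples behave:

```python
from typing import NamedTuple


# Stand-in for faust.types.tuples.TP (assumption: same field names).
class TP(NamedTuple):
    topic: str
    partition: int


# Stand-in for kafka.structs.TopicPartition (assumption: same field names).
class TopicPartition(NamedTuple):
    topic: str
    partition: int


faust_tp = TP("test-uploads", 0)
kafka_tp = TopicPartition("test-uploads", 0)

# Named tuples inherit tuple equality and hashing, which ignore the class.
same_value = faust_tp == kafka_tp
same_hash = hash(faust_tp) == hash(kafka_tp)
# Set membership uses hash and ==, so the `in` filter matches across types.
found_in_set = kafka_tp in {faust_tp}
# An isinstance check, however, does look at the class.
passes_isinstance = isinstance(faust_tp, TopicPartition)

print(same_value, same_hash, found_in_set, passes_isinstance)
# → True True True False
```

So a membership filter can succeed while a later isinstance check on the same object fails.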
Now for the initial question, how this all worked anyway: the incoming messages carry kafka.structs.TopicPartition info, the commit method of the consumer passes it along, and the consumer.commit method hands those to aiokafka, which completes the full lifecycle of the event / message.
This means the TopicPartition is never really changed to a faust internal TP when using the aiokafka driver. It's simply typed like one.
Why do you want to call commit directly anyway? Can't you reduce the commit interval if you want more frequent commits?
Hope this explanation helps to understand the inner workings and also resolves your question. This is actually not a bug, just somewhat bad type hinting and weak enforcement of the correct classes being passed to aiokafka. Plus there are no docs about this.
Edit 1: Actually this is bad wording. It is not badly typed; it's the only way we get a coherent way of typing internally.
That's the thing, I'm not calling commit or _commit manually. Some internal machinery is doing it.
That means that something is passing offsets of the type actually shown in the method signature.
Ah okay, sorry for the misunderstanding. It looks like if you start in client mode, the on_message callback provides Faust TPs. You still haven't shared the config that triggers this, so it's hard to say for sure.
https://github.com/faust-streaming/faust/blob/master/faust/transport/conductor.py#L268
Either way, it might be low-hanging fruit to ensure that only aiokafka TopicPartition instances are passed into the aiokafka lib.
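One way such a guard could look is sketched below. The helper name to_driver_offsets is hypothetical, and TP, TopicPartition and OffsetAndMetadata are local stand-ins rather than the real faust and aiokafka classes; the conversion itself mirrors the change proposed at the top of this thread:

```python
from typing import Any, Dict, Mapping, NamedTuple


# Local stand-ins (assumptions), not the real faust / aiokafka types.
class TP(NamedTuple):
    topic: str
    partition: int


class TopicPartition(NamedTuple):
    topic: str
    partition: int


class OffsetAndMetadata(NamedTuple):
    offset: int
    metadata: str


def to_driver_offsets(
    offsets: Mapping[Any, int],
) -> Dict[TopicPartition, OffsetAndMetadata]:
    """Hypothetical helper: rebuild every key as the driver's own type.

    Works for any key exposing `.topic` and `.partition`, so it does not
    matter whether the caller passed TP or TopicPartition instances.
    """
    return {
        TopicPartition(tp.topic, tp.partition): OffsetAndMetadata(offset, "")
        for tp, offset in offsets.items()
    }


converted = to_driver_offsets({TP("test-uploads", 0): 42})
all_driver_keys = all(type(tp) is TopicPartition for tp in converted)
print(all_driver_keys)
```

Rebuilding the keys unconditionally keeps the call site simple, at the cost of one small allocation per partition on each commit.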
I'm trying to create a minimal example, but I'm running a 32 hour processing job at the moment, so I don't want to interrupt it. I'll see what I can dig up.
Aren't the commits a no-op in client-only mode because the consumer doesn't have a consumer group?
I guess client-only mode would not reach this commit method, as it is intended as a dev mode that does not require Kafka and only has a simple reply_consumer.
Edit: so yes what you said.
I found a puzzling issue.
Both aiokafka and faust define named tuples for (topic, partition), namely TopicPartition and TP respectively. The faust one doesn't satisfy the isinstance check in aiokafka's commit_structure_validate and so the stack trace below occurs. What I don't understand is how this ever worked. Why do I only see this now? Is this some unusual commit path?
Either way, a faust TP is not an aiokafka TopicPartition according to isinstance, so this must be a bug.
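The kind of failure described here can be illustrated with a small sketch. The function validate_commit_keys is a hypothetical stand-in for an isinstance-based key check and is not aiokafka's actual commit_structure_validate; the two tuple classes are likewise local stand-ins:

```python
from typing import NamedTuple


# Local stand-ins (assumptions), not the real faust / aiokafka types.
class TP(NamedTuple):
    topic: str
    partition: int


class TopicPartition(NamedTuple):
    topic: str
    partition: int


def validate_commit_keys(offsets: dict) -> None:
    """Hypothetical check: reject any key that is not a TopicPartition."""
    for tp in offsets:
        if not isinstance(tp, TopicPartition):
            raise ValueError(
                f"expected TopicPartition key, got {type(tp).__name__}"
            )


# A driver-typed key passes the check silently.
validate_commit_keys({TopicPartition("test-uploads", 0): 1})

# An equal-valued key of the other class is rejected.
try:
    validate_commit_keys({TP("test-uploads", 0): 1})
    rejected = False
    reason = ""
except ValueError as exc:
    rejected = True
    reason = str(exc)

print(rejected, reason)
```

The two keys compare equal as tuples, yet only one of them passes a class-based check.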