aklivity / zilla-docs


There is no documentation on how messages are distributed among clients listening on gRPC streams #206

Closed hedhyw closed 6 months ago

hedhyw commented 6 months ago

Support for Kafka consumer groups with gRPC was added in this issue: https://github.com/aklivity/zilla/issues/597.

How does it work? I couldn't find anything in the documentation.

I want to understand how gRPC clients can be scaled so that messages are distributed between them.

Thank you!

vordimous commented 6 months ago

Hi @hedhyw, thank you for pointing this out. I will add more explanation to the docs as soon as I can.

Are you trying to fan out messages from Kafka to your gRPC services, produce messages into Kafka through a gRPC method, or just curious in general how it works?

I am trying to understand what component you are connecting with Zilla that needs to be scaled.

hedhyw commented 6 months ago

Thanks for the answer! I was interested in the fan-out scenario, and I'm also interested in how it works in general.

client <- gRPC stream <- Zilla <- Kafka.

Currently, clients receive all messages unless the headers include "last-message-id-bin". Is it possible to organize consumer clients into groups, so that the clients in a group distribute events among themselves?

vordimous commented 6 months ago

The short answer is that the kafka-grpc binding uses consumer groups and the grpc-kafka binding doesn't.

For your case, if you want to use Zilla as unique consumers that relay messages to a specific gRPC service, then you will need one zilla.yaml config namespace and/or named binding for each different consumer, using the kafka-grpc remote_server binding.

You can see an example where every message from the specified topic in the route is sent to the gRPC service where the response from that service is produced to the reply-to topic. This would create one consumer group.

From there, your specific situation would determine whether you configure one named binding for each gRPC service in a single zilla.yaml or, theoretically, run Zilla in a sidecar-type architecture where a unique Zilla instance runs alongside each of your gRPC clients and acts as the consumer. You can control who shares a consumer group ID by naming the Zilla namespace and binding appropriately.
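To make the naming point concrete, here is a small Python model of consumer-group delivery. It is only a sketch and not Zilla code: `group_key` and `deliver` are hypothetical names, and the way the group ID is built from the namespace and binding name is an assumption for illustration, not Zilla's actual format. It shows that instances with the same names split the messages, while instances with different names each receive everything.

```python
from collections import defaultdict

def group_key(namespace, binding):
    # Hypothetical stand-in for however the group ID is derived from the names.
    return f"{namespace}.{binding}"

def deliver(messages, instances):
    """messages: list of (partition, payload).
    instances: list of (instance_id, namespace, binding).
    Returns {instance_id: [payload, ...]}."""
    groups = defaultdict(list)
    for instance_id, namespace, binding in instances:
        groups[group_key(namespace, binding)].append(instance_id)

    received = {instance_id: [] for instance_id, _, _ in instances}
    for members in groups.values():
        members.sort()
        for partition, payload in messages:
            # Within a group, each partition is owned by exactly one member.
            owner = members[partition % len(members)]
            received[owner].append(payload)
    return received

messages = [(0, "a"), (1, "b"), (0, "c"), (1, "d")]

# Same namespace and binding name: one group, messages are split.
shared = deliver(messages, [("z1", "prod", "echo"), ("z2", "prod", "echo")])

# Different binding names: two groups, each instance sees every message.
separate = deliver(messages, [("z1", "prod", "echo_a"), ("z2", "prod", "echo_b")])

print(shared)
print(separate)
```

In this model the unit of distribution is the partition, as in a Kafka consumer group, so a topic with a single partition would still be consumed by only one member of a group.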

> How does it work? I couldn't find anything in the documentation.

I have opened a PR to add this context into the docs for future reference. We will also likely build some more comprehensive examples in the future.

hedhyw commented 6 months ago

Thank you!

So it means, currently it's not possible to achieve it in grpc-kafka binding with fetch capability?

As in https://github.com/aklivity/zilla-examples/blob/main/grpc.kafka.fanout/zilla.yaml, it is not possible to distribute messages between gRPC clients. Right?

vordimous commented 6 months ago

> So it means, currently it's not possible to achieve it in grpc-kafka binding with fetch capability?

Correct. This binding allows clients to fetch the data they want using filters and routing. Zilla doesn't store any state to identify a specific client, so it can't determine which client has received each message.
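A rough Python sketch of why a stateless fetch fans out rather than distributes: the only resume position is the one the client supplies, so two clients that supply nothing both get the full stream. The `fetch` function is hypothetical, and modelling the last message ID as a plain integer index is an assumption made for illustration; the real "last-message-id-bin" value is not described in this thread.

```python
def fetch(log, last_message_id=None):
    """Return the messages a new stream would receive.

    The resume position comes from the caller on every call;
    nothing is remembered about any client between calls.
    """
    start = 0 if last_message_id is None else last_message_id + 1
    return log[start:]

log = ["m0", "m1", "m2", "m3"]

client_a = fetch(log)                      # no resume point: full stream
client_b = fetch(log)                      # a second client: same full stream
resumed = fetch(log, last_message_id=1)    # client-supplied resume point

print(client_a, client_b, resumed)
```

Because nothing ties a message to "already delivered to client A", there is no basis for handing the next message to client B instead.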

> It is not possible to distribute messages between gRPC clients. Right?

Not from one topic through zilla. You would need to use kafka-grpc to consume messages in a group that are then sent to a gRPC service.

However, you could split the messages into different topics if you have to use a gRPC client to pull the data. The grpc-kafka binding can route traffic to different topics, or you can run multiple Zilla instances that expose the same gRPC service interface but are mapped to different topics. It would be a bit more manual work than a consumer group, but you could scale it horizontally so that each client gets a different set of messages.
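As an illustration of the topic-splitting idea, the following Python sketch assigns each message to one of several topics by hashing its key, so that a client reading a single topic sees a disjoint subset. The topic names, `topic_for`, and `split` are hypothetical, and the choice of a CRC32 hash on the key is an assumption; the producer could split messages by any stable rule.

```python
import zlib

def topic_for(key, topics):
    # Stable hash, so the same key always lands on the same topic.
    return topics[zlib.crc32(key.encode("utf-8")) % len(topics)]

def split(messages, topics):
    """messages: list of (key, payload). Returns {topic: [payload, ...]}."""
    by_topic = {topic: [] for topic in topics}
    for key, payload in messages:
        by_topic[topic_for(key, topics)].append(payload)
    return by_topic

topics = ["events-0", "events-1", "events-2"]  # hypothetical topic names
messages = [("user-%d" % i, "payload-%d" % i) for i in range(9)]

by_topic = split(messages, topics)
for topic, payloads in by_topic.items():
    print(topic, payloads)
```

Each gRPC client (or each Zilla instance in front of it) would then be pointed at exactly one of these topics. Unlike a consumer group, nothing rebalances automatically if a client goes away, which is the extra manual work mentioned above.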

hedhyw commented 6 months ago

Thank you for the very detailed response!