SeldonIO / MLServer

An inference server for your machine learning models, including support for multiple frameworks, multi-model serving and more
https://mlserver.readthedocs.io/en/latest/
Apache License 2.0

Enable consuming from a Kafka topic as a ConsumerGroup #1309

Open ichbinjakes opened 1 year ago

ichbinjakes commented 1 year ago

Hi,

I was trying to parallelize model inference off a Kafka topic across multiple server instances. I couldn't get it working until I modified MLServer to accept configuration that sets group_id on the server's Kafka consumer. This lets Kafka assign each consumer to a partition and track its offset; I also needed to configure multiple partitions on the topic.

Consumer docs: https://aiokafka.readthedocs.io/en/stable/api.html#consumer-class
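To make the proposal concrete, here is a sketch of what the server-level configuration might look like in MLServer's settings.json. The kafka_group_id field is hypothetical (it is the setting being proposed here); the other Kafka fields are shown only to indicate where it would sit, and the topic, server address, and group name are illustrative values.

```json
{
  "kafka_enabled": true,
  "kafka_servers": "localhost:9092",
  "kafka_topic_input": "mlserver-input",
  "kafka_topic_output": "mlserver-output",
  "kafka_group_id": "mlserver-workers"
}
```

With every replica configured with the same kafka_group_id, Kafka's consumer-group protocol would assign each replica a disjoint subset of the topic's partitions and commit offsets per group, which is what enables the parallel consumption described above.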

Is there interest in adding this to MLServer? I would be able to put in a PR; it's only a small change.

Jake

adriangonz commented 1 year ago

Hey @ichbinjakes ,

Thanks for raising this one.

Just so that we can understand the scope of the changes involved, would the same group_id apply to every model running on the instance? Or would this be a per-model setting?

ichbinjakes commented 1 year ago

Hey @adriangonz

If there are two models in an MLServer instance and you want each model to generate predictions for all of the data, I believe you would need a different group_id for each model, i.e. a per-model setting. Otherwise, an instance-wide setting should work fine. The use case I had in mind when I submitted the issue is a server instance with only one model; I have never used multiple models.
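If this were made a per-model setting, one way it could look is a field in each model's model-settings.json. This is purely a sketch of the idea discussed above; the kafka_group_id field does not exist in MLServer, and the model and group names are made up for illustration.

```json
{
  "name": "model-a",
  "implementation": "mlserver_sklearn.SKLearnModel",
  "parameters": {
    "kafka_group_id": "model-a-consumers"
  }
}
```

A second model on the same instance would declare its own group, e.g. "kafka_group_id": "model-b-consumers", so that each model's consumers form a separate group and each group independently receives the full stream from the topic.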