roadrunner-server / roadrunner

🤯 High-performance PHP application server, process manager written in Go and powered with plugins
https://docs.roadrunner.dev
MIT License

[🧹 CHORE]: Add support to pass custom keys for kafka records #2009

Closed edefimov closed 1 month ago

edefimov commented 1 month ago

No duplicates 🥲.

What should be improved or cleaned up?

The Kafka job driver uses the internal job identifier as the key of the Kafka message. However, in Kafka terms the record key can contain arbitrary data and is not required to be unique. Moreover, Kafka clients use the message key to calculate the target partition for the message. Applications rely on this to guarantee that messages with the same key are sent to the same topic partition and processed sequentially. I suggest adding an option to provide the Kafka message key from the application side.

The implementation seems quite straightforward, since the driver already has Kafka-specific options.
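To illustrate the partitioning behavior described above, here is a minimal Go sketch of key-based partition selection. It is not the code of any real Kafka client: FNV-1a from the standard library stands in for the client's actual hash (the Java client, for example, uses murmur2 by default), and `partitionFor`, the key strings, and the partition count are hypothetical names and values chosen for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// partitionFor maps a record key to a partition index in [0, numPartitions).
// Assumption: FNV-1a is used only for illustration; real clients use their
// own hash function. numPartitions must be greater than zero.
func partitionFor(key []byte, numPartitions int) int {
	h := fnv.New32a()
	h.Write(key) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(numPartitions))
}

func main() {
	const partitions = 6 // hypothetical partition count

	// The first and third records share a key, so they are assigned
	// to the same partition.
	for _, key := range []string{"entity-1", "entity-2", "entity-1"} {
		fmt.Printf("key=%s -> partition %d\n", key, partitionFor([]byte(key), partitions))
	}
}
```

Because the partition depends only on the key, a key that is unique per message (such as a job ID) spreads related messages across partitions, while an application-chosen key keeps them together.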

rustatian commented 1 month ago

Hey @edefimov 👋 The job ID in the PHP Jobs client is the msg.ID() you see in the Go server library. Do you have any problems passing this argument?

edefimov commented 1 month ago

Hi @rustatian, I experience problems if I want to pass some application value as the message key. For example, I have an event stream of entity updates partitioned by entity identifier. Each entity's updates must be processed sequentially. With a direct Kafka connection I can use the entity identifier as the Kafka record key, so all messages with the same key reach the same topic partition and are processed sequentially. It would be great to have the same guarantee with the RoadRunner Kafka driver.
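The use case in this comment can be sketched as a small simulation. This is an illustrative model only, not RoadRunner or Kafka client code: the `Update` type, the `route` function, the entity IDs, and the partition count are all hypothetical, and FNV-1a stands in for whatever hash a real client applies to the record key.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Update is a hypothetical entity update event.
type Update struct {
	EntityID string
	Version  int
}

// route appends each update to the partition selected by hashing its
// entity ID, in arrival order. Assumption: FNV-1a is a stand-in hash.
func route(updates []Update, numPartitions int) [][]Update {
	parts := make([][]Update, numPartitions)
	for _, u := range updates {
		h := fnv.New32a()
		h.Write([]byte(u.EntityID))
		p := int(h.Sum32() % uint32(numPartitions))
		parts[p] = append(parts[p], u)
	}
	return parts
}

func main() {
	// Interleaved updates for two hypothetical entities.
	updates := []Update{
		{"order-7", 1}, {"order-9", 1}, {"order-7", 2}, {"order-9", 2}, {"order-7", 3},
	}
	for p, log := range route(updates, 3) {
		fmt.Printf("partition %d: %v\n", p, log)
	}
}
```

In this model every update for a given entity lands in a single partition and keeps its arrival order there, which is the sequential-processing guarantee the comment asks for.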

rustatian commented 1 month ago

I experience problems if I want to pass some application value as the message key.

Do you mean that you want to send a Kafka message from the PHP worker and have problems with that?

edefimov commented 1 month ago

Yes, I cannot set the Kafka record key from the PHP worker: this library, which integrates RoadRunner jobs into PHP, does not allow providing a message ID. If I bypass this limitation by calling "jobs.Push" directly over RPC and providing my own ID, my use case will break RoadRunner itself, because RoadRunner expects the job ID to be a unique value. In the example described in my comment above, I need the same key for different messages. This is a valid use case for Kafka, but it is impossible to implement with the RoadRunner Kafka driver.

rustatian commented 1 month ago

RR does not expect the ID to be unique. You may safely use the jobs.Push RPC call with any arguments you want; the only assumption RR makes is that the ID is not empty.

rustatian commented 1 month ago

Closing because that functionality already exists. You are welcome to continue the discussion here, since the ticket is not locked.