redpanda-data / redpanda

Redpanda is a streaming data platform for developers. Kafka API compatible. 10x faster. No ZooKeeper. No JVM!
https://redpanda.com

Option to add producer info into header #23493

Open ramazanpolat opened 2 days ago

ramazanpolat commented 2 days ago

Who is this for and what problem do they have today?

We have a logs topic that aggregates log messages from hundreds of clients, all producing log messages in JSON format. Creating a separate topic for each producer is impractical, especially since each producer has its own credentials (limited to write access to the topic). Sharing a single topic, however, presents a significant challenge for our consumers.

Currently, consumers cannot reliably determine which producer sent a given message. Although producers can embed their client ID, hostname, or other identifiers within the message, there is still a risk of unintentional or intentional impersonation, because a producer can set its client ID to any value. This ambiguity makes it harder to trace messages back to their source and undermines the integrity of our logging system.

What are the success criteria?

Redpanda can attach the producer's authenticated username to each message (for example, as a record header), so that consumers can reliably distinguish between producers. Optionally attaching additional producer-related information would further improve traceability.

Why is solving this problem impactful?

Redpanda serves as an all-in-one solution: it removes the need for ZooKeeper and integrates features such as a schema registry, transformations, and more. However, to reliably differentiate producers in messages, we currently need an additional layer between the producer and Redpanda that validates credentials and inserts the relevant producer information into each message. We could not find a feasible way to build this layer on top of the Kafka protocol itself, so we resorted to a REST-based approach.

Unfortunately, this workaround requires every producer to send messages to a REST endpoint instead of producing over the Kafka protocol, which introduces new challenges and inefficiencies. A Redpanda feature that adds producer information directly to each message would let us streamline our workflow and improve message traceability without a cumbersome intermediary layer.
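
To make the request more concrete, here is a minimal sketch of the enrichment step that our intermediary layer performs today and that we would like the broker to perform instead. It is an illustration under assumptions, not Redpanda code: the header key `x-producer-username` and the function names `attach_producer_header` and `producer_of` are hypothetical, and the authenticated username is assumed to come from whatever credential check has already taken place.

```python
from typing import List, Optional, Tuple

# A record header modeled as a (key, value-bytes) pair.
Header = Tuple[str, bytes]

# Hypothetical header key; not defined by Redpanda or the Kafka protocol.
PRODUCER_USER_HEADER = "x-producer-username"


def attach_producer_header(
    authenticated_user: str,
    client_headers: Optional[List[Header]] = None,
) -> List[Header]:
    """Return the record headers with a trusted producer-identity header added.

    Any copy of the reserved header supplied by the client is dropped first,
    so a producer cannot claim to be someone else.
    """
    kept = [
        (key, value)
        for key, value in (client_headers or [])
        if key.lower() != PRODUCER_USER_HEADER
    ]
    kept.append((PRODUCER_USER_HEADER, authenticated_user.encode("utf-8")))
    return kept


def producer_of(headers: List[Header]) -> Optional[str]:
    """Consumer side: read the producer username, or None if it is absent."""
    for key, value in headers:
        if key == PRODUCER_USER_HEADER:
            return value.decode("utf-8")
    return None


if __name__ == "__main__":
    # The client tries to impersonate "svc-admin"; the trusted value wins.
    incoming = [("x-producer-username", b"svc-admin"), ("trace-id", b"abc")]
    headers = attach_producer_header("svc-billing", incoming)
    print(producer_of(headers))
```

The key point of the sketch is that the identity value comes from the authenticated session rather than from anything the client wrote into the message, which is why this step has to run somewhere the producer cannot control.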