snowplow / stream-collector

Collector for cloud-native web, mobile and event analytics, running on AWS and GCP
http://snowplowanalytics.com

Add ability to configure Kafka producers individually #357

Open jbeemster opened 1 year ago

jbeemster commented 1 year ago

Currently we only allow a single producerConf shared by both Kafka Producers (for the raw and bad output streams). This means we end up dynamically inferring things like client.id, which are then set to producer-1 and producer-2 (and which renders the Stream Lineage view in Confluent Cloud effectively useless).

It also means permissions for the two topics need to be granted to a shared principal rather than being scoped individually.

All other Kafka applications enforce segmented configuration for each producer and consumer, and the Collector should follow suit.
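For illustration, a per-stream configuration could look something like the following. This is only a sketch: `producerConf` exists in the current collector config, but the split into separate good/bad blocks and the key names shown here are assumptions, not the current format.

```hocon
# Hypothetical per-stream producer configuration (sketch, not the current format).
# Today a single producerConf applies to both producers.
streams {
  good {
    name = "raw"
    producerConf {
      # A distinct client.id per producer keeps Confluent's Stream Lineage readable
      "client.id" = "collector-good"
    }
  }
  bad {
    name = "bad-1"
    producerConf {
      "client.id" = "collector-bad"
    }
  }
}
```

With separate blocks, each producer could also authenticate with its own credentials, so topic permissions no longer need to be shared.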

benjben commented 1 year ago

We had a discussion about this when starting to work on the refactoring of the collector.

The question was: should we take the opportunity to align the configuration of the collector with the other apps, and have:

input {
}
output {
  good {
  }
  bad {
  }
}

It was decided to keep the existing format for now, so that we don't need to worry about changing Terraform yet (both for OSS and our infra).

Once we have made the switch to the http4s collector, we can look at making these breaking changes and unifying the collector's configuration format with the other apps, which will make it possible to configure the good and bad streams differently.
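Under that unified format, per-stream producer settings would fall out naturally. A sketch of what it might look like once filled in (the nested `producerConf` blocks and key names are assumptions, not an agreed design):

```hocon
# Hypothetical unified format after the http4s switch (sketch only).
input {
  # HTTP listener settings would live here
}
output {
  good {
    name = "raw"
    producerConf {
      "client.id" = "collector-good"
    }
  }
  bad {
    name = "bad-1"
    producerConf {
      "client.id" = "collector-bad"
    }
  }
}
```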

/cc @stanch