snowplow / snowbridge

For replicating streams across clouds, accounts and regions

Make setting of EventHub Partition Key configurable #148

Closed: colmsnowplow closed this issue 2 years ago

colmsnowplow commented 2 years ago

The current behaviour of the EventHub target is to set the Partition Key of each event to whatever the message's partition key is.

While adding unit tests for this package, I discovered that this actually impedes the EventHub client's batching behaviour, which appears to be as follows:

So our current implementation creates a scenario where, by default, we only ever send single-event batches, because we always explicitly set the partition key.
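To illustrate the effect, here is a minimal sketch in Go. It assumes, purely for illustration, that events can only share a batch when they share a partition key; the client's real batching rules may differ. The `message` type and `groupByPartitionKey` function are hypothetical and are not part of this package or of the EventHub client.

```go
package main

import "fmt"

// message is a hypothetical stand-in for a message read from the source stream.
type message struct {
	PartitionKey string
	Data         string
}

// groupByPartitionKey models the ASSUMED batching rule: only events that
// share a partition key can end up in the same batch.
func groupByPartitionKey(msgs []message) map[string][]message {
	batches := make(map[string][]message)
	for _, m := range msgs {
		batches[m.PartitionKey] = append(batches[m.PartitionKey], m)
	}
	return batches
}

func main() {
	// Every message carries its own distinct key, so no two can share a batch.
	unique := []message{{"a", "1"}, {"b", "2"}, {"c", "3"}}
	// A transformation has mapped the messages onto a small set of keys.
	shared := []message{{"x", "1"}, {"x", "2"}, {"y", "3"}}

	fmt.Println(len(groupByPartitionKey(unique))) // 3 batches, one event each
	fmt.Println(len(groupByPartitionKey(shared))) // 2 batches
}
```

Under that assumption, distinct keys per message give one batch per event, while a key shared across messages allows larger batches.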

If the user sets the partition key via a transformation, we will batch appropriately.

I propose we introduce an option not to set the partition key on the EventHub event.
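A rough sketch of what such an option could look like, in Go. All names here are hypothetical: `targetConfig`, `SetPartitionKey`, `event` and `buildEvent` are illustrative stand-ins, not the package's actual configuration or the EventHub client's types.

```go
package main

import "fmt"

// targetConfig is a hypothetical target configuration.
// SetPartitionKey is an assumed option name, not an existing setting.
type targetConfig struct {
	SetPartitionKey bool
}

// event is a hypothetical stand-in for the client's event type.
// A nil PartitionKey means no key was set on the event.
type event struct {
	Data         []byte
	PartitionKey *string
}

// buildEvent copies the message's partition key onto the event only
// when the option is switched on.
func buildEvent(cfg targetConfig, data []byte, partitionKey string) event {
	evt := event{Data: data}
	if cfg.SetPartitionKey {
		evt.PartitionKey = &partitionKey
	}
	return evt
}

func main() {
	payload := []byte("payload")
	withKey := buildEvent(targetConfig{SetPartitionKey: true}, payload, "user-1")
	withoutKey := buildEvent(targetConfig{SetPartitionKey: false}, payload, "user-1")

	fmt.Println(withKey.PartitionKey != nil)    // true
	fmt.Println(withoutKey.PartitionKey != nil) // false
}
```

Using a pointer for the key keeps "not set" distinct from "set to an empty string", which is one way to let the client fall back to its own partitioning when the option is off.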

That leaves the question of default behaviour. The options are:

  1. Set the partition key by default, and document that it must be switched off when batching for the EventHubs target
  2. Don't set the partition key by default, and document that it must be switched on when the partition key is set via a transformation.

Since option 1 keeps the nuance contained within the EventHubs target's configuration docs, I prefer that. Open to alternative viewpoints.