AbsaOSS / hyperdrive

Extensible streaming ingestion pipeline on top of Apache Spark
Apache License 2.0
44 stars 13 forks source link

Publish key from kafka source as key to kafka sink / HyperdriveContext #114

Closed kevinwallimann closed 4 years ago

kevinwallimann commented 4 years ago

Currently, the key from a kafka source is not ingested in ConfluentAvroKafkaStreamDecoder. Also, no key is produced to the kafka sink in KafkaStreamWriter. It should be possible to publish the ingested keys along with the value.

Consuming keys is supported by Abris like so

val result: DataFrame  = dataFrame.select(
    from_confluent_avro(col("key"), keyRegistryConfig) as 'key,
    from_confluent_avro(col("value"), valueRegistryConfig) as 'value)

Tasks