AbsaOSS / hyperdrive

Extensible streaming ingestion pipeline on top of Apache Spark
Apache License 2.0

All components: User-defined options should always be added at end #146

Closed kevinwallimann closed 3 years ago

kevinwallimann commented 4 years ago

User-defined options should always override any options preset by the component, e.g. in KafkaStreamWriter:

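confluentAvroDataFrame
  .writeStream
  // user-defined options are applied first here ...
  .options(extraOptions)
  // ...
  // ... so these presets, set afterwards, win for the same keys
  .option("topic", topic)
  .option("kafka.bootstrap.servers", brokers)
  .format("kafka")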

In this case, the user has control over both topic and brokers, but to be consistent, the call to .options with user-defined options should always be at the end.
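
A minimal sketch of why the position of the call matters, assuming (as is my understanding of Spark's writer API) that repeated option calls are last-write-wins for the same key. It models the builder with a plain Map instead of calling Spark, so `OptionOrdering`, `applyInOrder`, and the sample keys and values are hypothetical names for illustration only; it also ignores that Spark treats option keys case-insensitively.

```scala
object OptionOrdering {
  // Hypothetical stand-in for a chain of .option/.options calls:
  // batches are applied left to right, and a later batch overrides
  // an earlier one whenever both define the same key.
  def applyInOrder(batches: Map[String, String]*): Map[String, String] =
    batches.foldLeft(Map.empty[String, String])(_ ++ _)

  def main(args: Array[String]): Unit = {
    // Options preset by the component (illustrative values).
    val presets = Map(
      "topic" -> "component-topic",
      "kafka.bootstrap.servers" -> "broker-a:9092"
    )
    // Options supplied by the user (illustrative values).
    val extraOptions = Map(
      "topic" -> "user-topic",
      "kafka.acks" -> "all"
    )

    // User options applied last: the user's value for "topic" survives.
    val userLast = applyInOrder(presets, extraOptions)
    // User options applied first: the preset silently replaces it.
    val userFirst = applyInOrder(extraOptions, presets)

    println(s"user options last:  topic = ${userLast("topic")}")
    println(s"user options first: topic = ${userFirst("topic")}")
  }
}
```

In the writer chain this corresponds to moving .options(extraOptions) after the preset .option calls, so that a user-supplied key always takes precedence over the component's default.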

kevinwallimann commented 3 years ago

https://github.com/AbsaOSS/hyperdrive/blob/v4.4.1/ingestor-default/src/main/scala/za/co/absa/hyperdrive/ingestor/implementation/transformer/deduplicate/kafka/DeduplicateKafkaSinkTransformer.scala#L164-L171

https://github.com/AbsaOSS/hyperdrive/blob/v4.4.1/ingestor-default/src/main/scala/za/co/absa/hyperdrive/ingestor/implementation/writer/parquet/ParquetStreamWriter.scala#L62-L63