AbsaOSS / hyperdrive

Extensible streaming ingestion pipeline on top of Apache Spark
Apache License 2.0

Improve logging for Deduplicator #212

Closed: kevinwallimann closed this issue 3 years ago

kevinwallimann commented 3 years ago

Logging should be improved for the Deduplicator. Currently, KafkaUtil.offsetsHaveBeenReached produces a lot of log output because it runs inside a loop. This should be avoided. Instead, the Deduplicator should log the current offsets in the source topic and the consumed offsets in the sink topic.
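One possible shape for this change, as a minimal sketch rather than the actual hyperdrive code: keep the offset check free of logging so that it can run in a loop, and build a single summary line per topic that the caller logs outside the loop. The class `OffsetLogSketch`, the `summarize` helper, the method signatures, and the representation of offsets as a map from partition number to offset are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of hyperdrive.
final class OffsetLogSketch {
    private OffsetLogSketch() {}

    // Pure check with no logging, so calling it repeatedly in a loop
    // produces no log output. Offsets are assumed to be keyed by partition number.
    static boolean offsetsHaveBeenReached(Map<Integer, Long> consumed,
                                          Map<Integer, Long> target) {
        return target.entrySet().stream()
            .allMatch(e -> consumed.getOrDefault(e.getKey(), -1L) >= e.getValue());
    }

    // Builds one log line for a whole topic, with partitions sorted by number.
    static String summarize(String label, Map<Integer, Long> offsets) {
        return label + ": " + new TreeMap<Integer, Long>(offsets).entrySet().stream()
            .map(e -> e.getKey() + "=" + e.getValue())
            .collect(Collectors.joining(", ", "[", "]"));
    }
}
```

With this split, the caller would invoke `offsetsHaveBeenReached` inside the loop and pass the results of `summarize("source topic", ...)` and `summarize("sink topic", ...)` to its logger outside the loop, so the amount of log output no longer grows with the number of iterations.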