Closed ROOBALJINDAL closed 6 months ago
@nsivabalan, can you please check?
@ROOBALJINDAL Is it possible to try the same on EMR, so that you can get all the logs and we can look into this further? There are no known changes in the 0.14.0 upgrade that would cause this.
@ad1happy2go We need time to set up a new cluster. Our AWS MSK Kafka cluster uses Kafka version 2.6.2. Can you confirm whether this is fine or whether it could be an issue? Is there a specific supported Kafka version?
I don't think this is a Kafka version issue, since the job is not failing. We need more logs to debug this.
I have found the issue. We use a custom MssqlDebeziumSource class as the Debezium source, and its constructor was using HoodieStreamerMetrics instead of HoodieIngestionMetrics (which was introduced in Hudi 0.14.0). Once we corrected the class, it started working. We can close this issue.
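For anyone hitting something similar with a custom source, here is a minimal sketch of why the declared constructor parameter type can matter. It assumes the source class is looked up reflectively by exact parameter types, which is not confirmed anywhere in this thread. `IngestionMetrics`, `StreamerMetrics`, `OldStyleSource` and `FixedSource` are hypothetical stand-ins, not the Hudi classes.

```java
import java.lang.reflect.Constructor;

// Stand-ins for illustration only; these are NOT the Hudi classes.
abstract class IngestionMetrics {}

class StreamerMetrics extends IngestionMetrics {}

// Hypothetical custom source declared against the concrete metrics type.
class OldStyleSource {
    public OldStyleSource(StreamerMetrics metrics) {}
}

// Hypothetical custom source declared against the base metrics type.
class FixedSource {
    public FixedSource(IngestionMetrics metrics) {}
}

class ConstructorLookupDemo {
    // Returns true only if clazz declares a public constructor whose single
    // parameter type is exactly IngestionMetrics. Class.getConstructor matches
    // formal parameter types exactly, so a subclass parameter does not count.
    static boolean hasBaseMetricsConstructor(Class<?> clazz) {
        try {
            Constructor<?> c = clazz.getConstructor(IngestionMetrics.class);
            return c != null;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasBaseMetricsConstructor(OldStyleSource.class));
        System.out.println(hasBaseMetricsConstructor(FixedSource.class));
    }
}
```

Running `ConstructorLookupDemo` prints `false` and then `true`: a lookup by the base type only finds the constructor that declares the base type. Whether this is the actual mechanism behind the silent failure above would need to be checked against the Hudi source.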
Issue:
We migrated from Hudi 0.13.0 to Hudi 0.14.0, and in this version upserts from Kafka CDC events are not working. The table is created the first time, but afterwards any record added or updated in the SQL table, which pushes a CDC event to Kafka, is not updated in the Hudi table. Is there any new configuration required for Hudi 0.14.0?
We are running AWS EMR Serverless 6.15. We tried to enable debug-level logs by providing the following classification to the serverless app, which modifies the log4j properties to print Hudi package logs, but the logs are still not printed.
Since it is serverless, we can't SSH into a node to inspect the log4j properties file, so we couldn't get the Hudi logs.
Configurations:
### Spark job parameters:
### kafka-source.properties:
### Table config properties:
Environment Description
Hudi version : 0.14.0
Spark version : 3.4.1
Hive version : 3.1.3
Hadoop version : 3.3.6
Storage (HDFS/S3/GCS..) : S3