Running into the error below while using EMR 6.10.1 (Hudi 0.12.2-amzn-0). I want to understand whether this was fixed in newer Hudi versions. Found a similar issue reported for Flink: https://github.com/apache/hudi/issues/6540
23/10/16 17:14:47 INFO S3NativeFileSystem: Opening 's3://<redacted_s3path>/pdate=2023-10-12/.16c8f46c-a073-403b-b830-553f8c44b735-0_20231013061901564.log.1_4-452-41829' for reading
23/10/16 17:15:35 INFO S3NativeFileSystem: Opening 's3://<redacted_s3path>/pdate=2023-10-12/16c8f46c-a073-403b-b830-553f8c44b735-0_0-457-44680_20231013061901564.parquet' for reading
23/10/16 17:15:35 ERROR BoundedInMemoryExecutor: error consuming records
org.apache.hudi.com.esotericsoftware.kryo.KryoException: java.lang.NullPointerException
Serialization trace:
props (org.apache.avro.Schema$FixedSchema)
schema (org.apache.avro.generic.GenericData$Fixed)
orderingVal (org.apache.hudi.common.model.DefaultHoodieRecordPayload)
data (org.apache.hudi.common.model.HoodieAvroRecord)
at org.apache.hudi.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:144) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:731) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:731) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:731) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:125) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:543) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:813) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.common.util.SerializationUtils$KryoSerializerInstance.deserialize(SerializationUtils.java:100) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.common.util.SerializationUtils.deserialize(SerializationUtils.java:74) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.common.util.collection.BitCaskDiskMap.get(BitCaskDiskMap.java:210) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.common.util.collection.BitCaskDiskMap.get(BitCaskDiskMap.java:203) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.common.util.collection.BitCaskDiskMap.get(BitCaskDiskMap.java:199) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.common.util.collection.BitCaskDiskMap.get(BitCaskDiskMap.java:68) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.common.util.collection.ExternalSpillableMap.get(ExternalSpillableMap.java:198) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.common.util.collection.ExternalSpillableMap.get(ExternalSpillableMap.java:55) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.io.HoodieMergeHandle.write(HoodieMergeHandle.java:340) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.table.action.commit.BaseMergeHelper$UpdateHandler.consumeOneRecord(BaseMergeHelper.java:90) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.table.action.commit.BaseMergeHelper$UpdateHandler.consumeOneRecord(BaseMergeHelper.java:80) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.common.util.queue.BoundedInMemoryQueueConsumer.consume(BoundedInMemoryQueueConsumer.java:37) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at org.apache.hudi.common.util.queue.BoundedInMemoryExecutor.lambda$null$2(BoundedInMemoryExecutor.java:135) ~[hudi-spark3-bundle_2.12-0.12.2-amzn-0.jar:0.12.2-amzn-0]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_382]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_382]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_382]
at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_382]
Caused by: java.lang.NullPointerException
Hudi options -
Environment Description
Hudi version : 0.12.2
Spark version : 3.3.1
Hive version : 3.1.3
Storage (HDFS/S3/GCS..) : S3
Stacktrace: see the Kryo/BoundedInMemoryExecutor trace pasted above.
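Not a confirmed fix, but for reference: the trace shows the NPE firing in `BitCaskDiskMap.get()` while Kryo-deserializing merge records that were spilled to disk, so keeping records out of the BitCask map may sidestep it. A minimal PySpark-style sketch, assuming standard Hudi write options (the config keys are real Hudi options, but whether they avoid this particular bug is an assumption; the table name and size are illustrative):

```python
# Hedged workaround sketch for the Kryo NPE seen during merge.
# Assumption: avoiding the BitCask spillable disk map avoids the failing
# Kryo deserialization path. Untested against this exact bug.
hudi_options = {
    "hoodie.table.name": "my_table",  # hypothetical table name
    # The NPE fires in BitCaskDiskMap.get(); switching the spillable disk
    # map implementation to RocksDB changes the (de)serialization path:
    "hoodie.common.spillable.diskmap.type": "ROCKS_DB",
    # Alternatively, give the merge more heap so records are less likely
    # to spill to disk at all (value is in bytes):
    "hoodie.memory.merge.max.size": str(2 * 1024 * 1024 * 1024),
}

# Typical usage with a Spark DataFrame `df` (PySpark assumed available):
# df.write.format("hudi").options(**hudi_options).mode("append").save(path)
```

If upgrading is an option, it is also worth re-testing on a newer Hudi release before relying on either setting.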