linkedin / spark-tfrecord

Read and write Tensorflow TFRecord data from Apache Spark.
BSD 2-Clause "Simplified" License

NullPointerException at org.apache.hadoop.conf.Configuration #69

Open Yingminzhou opened 11 months ago

Yingminzhou commented 11 months ago

Hi, my Scala version is 2.12.15 and my Spark version is 3.0. I use spark-tfrecord_2.12:0.40, but I hit an error:

Caused by: java.lang.NullPointerException
    at org.apache.hadoop.conf.Configuration.<init>(Configuration.java:821)
    at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:440)
    at org.apache.hadoop.mapreduce.task.JobContextImpl.<init>(JobContextImpl.java:67)
    at org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl.<init>(TaskAttemptContextImpl.java:49)
    at org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl.<init>(TaskAttemptContextImpl.java:44)
    at com.linkedin.spark.datasources.tfrecord.TFRecordFileReader$.readFile(TFRecordFileReader.scala:32)
    at com.linkedin.spark.datasources.tfrecord.DefaultSource.$anonfun$buildReader$1(DefaultSource.scala:132)

I tested the following case, and it runs fine:

Do you have any ideas about this error? I am really confused. lol.

junshi15 commented 11 months ago

It is hard to say where the problem is with the limited information provided here. If you can run the examples in the README file, then your setup is likely correct. Then the problem might be in the TFRecord files.
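If the TFRecord files are the suspect, one way to check them independently of Spark is to walk the record framing and verify the checksums. The following is only a minimal sketch, not part of spark-tfrecord: the helper names (`crc32c`, `masked_crc`, `write_record`, `count_valid_records`) are hypothetical, and it assumes uncompressed files in the standard TFRecord layout (little-endian 8-byte length, masked CRC32C of the length, payload, masked CRC32C of the payload).

```python
import io
import struct

_CRC32C_POLY = 0x82F63B78  # reflected Castagnoli polynomial
_MASK_DELTA = 0xA282EAD8   # constant used in TFRecord's masked CRC


def _build_table():
    """Precompute the byte-wise CRC32C lookup table."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ _CRC32C_POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _build_table()


def crc32c(data: bytes) -> int:
    """Plain CRC32C (Castagnoli) of data."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def masked_crc(data: bytes) -> int:
    """CRC32C rotated right by 15 bits plus a constant, as TFRecord stores it."""
    crc = crc32c(data)
    rotated = ((crc >> 15) | (crc << 17)) & 0xFFFFFFFF
    return (rotated + _MASK_DELTA) & 0xFFFFFFFF


def write_record(stream, payload: bytes) -> None:
    """Write one framed record (only used here to build test input)."""
    header = struct.pack("<Q", len(payload))
    stream.write(header)
    stream.write(struct.pack("<I", masked_crc(header)))
    stream.write(payload)
    stream.write(struct.pack("<I", masked_crc(payload)))


def count_valid_records(stream) -> int:
    """Return the number of records; raise ValueError on the first bad one."""
    count = 0
    while True:
        header = stream.read(8)
        if not header:
            return count  # clean end of file
        length_crc = stream.read(4)
        if len(header) < 8 or len(length_crc) < 4:
            raise ValueError(f"record {count}: truncated header")
        if struct.unpack("<I", length_crc)[0] != masked_crc(header):
            raise ValueError(f"record {count}: length checksum mismatch")
        (length,) = struct.unpack("<Q", header)
        payload = stream.read(length)
        data_crc = stream.read(4)
        if len(payload) < length or len(data_crc) < 4:
            raise ValueError(f"record {count}: truncated payload")
        if struct.unpack("<I", data_crc)[0] != masked_crc(payload):
            raise ValueError(f"record {count}: payload checksum mismatch")
        count += 1
```

To use it on a real file, open the file in binary mode and pass the handle to `count_valid_records`. A gzip-compressed file would need to be opened with `gzip.open` first. A clean pass only shows that the framing is intact; it says nothing about whether the payloads match the schema Spark expects.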