nitinware opened this issue 2 years ago
The error message is very clear: recordType can be "Example" or "SequenceExample".
Instead of .option("recordType", "tfrecords"), you should use .option("recordType", "Example") or .option("recordType", "SequenceExample").
Please take a look at the README: https://github.com/linkedin/spark-tfrecord#features
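For reference, a minimal sketch of the corrected write call (PySpark, assuming df and outputPath are defined as in the reporter's snippet below):

# Write the DataFrame as TFRecord Example records instead of the invalid "tfrecords" value.
df.write \
    .mode("overwrite") \
    .format("tfrecord") \
    .option("recordType", "Example") \
    .save(outputPath + "/tf-records/")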
Thanks for the quick response. I'm now seeing the error below; appreciate your inputs, thanks -
java.lang.ClassCastException: com.linkedin.spark.shaded.org.tensorflow.example.FeatureList cannot be cast to com.linkedin.spark.shaded.org.tensorflow.example.Feature
at com.linkedin.spark.datasources.tfrecord.TFRecordSerializer.$anonfun$serializeExample$1(TFRecordSerializer.scala:22)
at com.linkedin.spark.datasources.tfrecord.TFRecordSerializer.$anonfun$serializeExample$1$adapted(TFRecordSerializer.scala:19)
at scala.collection.immutable.Range.foreach(Range.scala:158)
at com.linkedin.spark.datasources.tfrecord.TFRecordSerializer.serializeExample(TFRecordSerializer.scala:19)
at com.linkedin.spark.datasources.tfrecord.TFRecordOutputWriter.write(TFRecordOutputWriter.scala:29)
at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.write(FileFormatDataWriter.scala:140)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeTask$1(FileFormatWriter.scala:278)
at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1473)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:286)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$15(FileFormatWriter.scala:210)
I am guessing your data is "SequenceExample", but you are trying to write it as "Example".
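A hedged sketch (PySpark) of how column shape drives the choice of recordType; the column names, values, and spark session here are illustrative assumptions, not taken from the issue:

from pyspark.sql import Row

# Scalar and flat-array columns map to Features, so "Example" works.
flat_df = spark.createDataFrame([Row(label=1.0, features=[0.1, 0.2, 0.3])])
flat_df.write.mode("overwrite").format("tfrecord") \
    .option("recordType", "Example").save(outputPath + "/tf-records-example/")

# A nested array (array of arrays) maps to a FeatureList, which only fits a
# SequenceExample; writing such a column with recordType "Example" raises the
# ClassCastException shown in the stack trace above.
seq_df = spark.createDataFrame([Row(label=1.0, steps=[[0.1, 0.2], [0.3, 0.4]])])
seq_df.write.mode("overwrite").format("tfrecord") \
    .option("recordType", "SequenceExample").save(outputPath + "/tf-records-seq/")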
I am trying to write a Spark DataFrame to 'tfrecord':
df.write.mode("overwrite").format("tfrecord").option("recordType", "tfrecords").save(outputPath + '/tf-records/')
I am running on a GCP Dataproc cluster, which comes with Spark version '3.1.2', and I am using the spark-tfrecord jar 'spark-tfrecord_2.12-0.3.4.jar'. Seeing the error below on the write operation -
Appreciate your inputs on this issue, thanks.