Thanks for your question, @GaganSD.
I tested your script in the pyspark shell by launching it with `bin/pyspark --packages com.linkedin.sparktfrecord:spark-tfrecord_2.12:0.3.2`, then copied your code into the pyspark REPL. It worked for me.
I don't know why you were seeing the error. My guess is that it has something to do with your environment settings.
Right, I managed to find the bug, and it was indeed an issue with the environment settings.
We were actually using Spark 3.0.1, not Spark 3.0(.0) as I mentioned above. I think this shows that this library isn't compatible with the newer releases of Spark.
Upon searching online, I found that this also happens when the pyspark version doesn't match the installed Spark version, or when the Java version is 9+, since Spark doesn't seem to work well with newer Java releases. Hope this helps anyone who runs into this error.
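For anyone who wants to check for that kind of mismatch, something along these lines works (a rough sketch, not from my actual setup; adjust for your own environment):

```python
# Check that the pip-installed PySpark matches the Spark runtime you launch.
import pyspark
print(pyspark.__version__)   # version of the pip-installed PySpark, e.g. "3.0.1"

from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
print(spark.version)         # version of the Spark runtime actually backing the session

# These two should match. Also check the JVM from a shell with `java -version`;
# Spark 3.0.x officially targets Java 8/11, and other Java versions can cause trouble.
```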
Thanks for the reply! :)
Congrats
@GaganSD Glad you found the root cause. In fact, I tested with spark-3.1.2 and it worked for me.
@junshi15 Well, I tested with spark-3.0.1 and got the same error message as @GaganSD. It seems that it's indeed an issue with the environment settings.
Hi @junshi15
I have the same problem as #15. I'm trying to replicate the test examples shown in the README but I'm unable to because of this error. I'm using Scala 2.12 and Spark 3.0 with version 0.3.2 of spark-tfrecord.
With this installed, Spark can write TFRecords as expected, but it can't read back the same TFRecords it created.
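For reference, the write/read round trip looks roughly like the PySpark example in the README (the path and schema below are placeholders, not my exact script):

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, IntegerType, StringType

# Shell started with the connector on the classpath, e.g.:
#   bin/pyspark --packages com.linkedin.sparktfrecord:spark-tfrecord_2.12:0.3.2
spark = SparkSession.builder.getOrCreate()

path = "/tmp/test-output.tfrecord"   # placeholder path
schema = StructType([StructField("id", IntegerType()),
                     StructField("name", StringType())])
df = spark.createDataFrame([(1, "alice"), (2, "bob")], schema)

# The write step works as expected:
df.write.mode("overwrite").format("tfrecord").option("recordType", "Example").save(path)

# The read step is where the error shows up:
read_df = spark.read.format("tfrecord").option("recordType", "Example").load(path)
read_df.show()
```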
I get this error message:
Do you know a fix for this?
I'm using Python 3.7. Here's my code (it's mostly from this repo's README):
I get this error message:
Let me know if you know a fix for this, thanks! :)