Open · haow85 opened this issue 4 years ago
Sorry, I hadn't noticed this issue before. Have you found a workaround for the problem?
Maybe the problem is the Scala or Spark version. This repo is very old and I'm not using it anymore, but if you want to contribute a fix for the problem, I'll accept it.
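The version mismatch suspected above can be checked quickly on the machine that submits the job; a minimal sketch (assumes `spark-submit` and `pyspark` are on the path):

```shell
# Print the Spark build (including its Scala version) used by the cluster tooling.
spark-submit --version

# Print the PySpark version the Python driver script imports; the
# ClassCastException below is often reported when these two disagree.
python -c "import pyspark; print(pyspark.__version__)"
```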
I was running the xclimf_spark code, but I got the following error:
```
Traceback (most recent call last):
  File "/home/hadoop/CLiMF/xclimf/xclimf_spark.py", line 98, in <module>
    main()
  File "/home/hadoop/CLiMF/xclimf/xclimf_spark.py", line 86, in main
    (U, V) = alg(train, opts)
  File "/home/hadoop/CLiMF/xclimf/xclimf_spark.py", line 35, in alg
    model = x.fit(_py2java(sc, ratings))
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o202.fit.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 11 in stage 12.0 failed 4 times, most recent failure: Lost task 11.3 in stage 12.0 (TID 1628, ip-10-13-250-104.us-west-2.compute.internal, executor 25): java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD
```
It looks like `_py2java` failed to convert the Python list into a Scala Seq. How can this be fixed?
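For what it's worth, this exact `ClassCastException` on `RDD.dependencies_` is commonly reported not as a conversion failure but as a classloader mismatch: the application jar is visible to the driver but is not shipped to the executors (or driver and executors run different Spark/Scala builds). A commonly suggested sketch of the submit invocation, where the jar name is a placeholder for whatever jar holds the xclimf Scala classes:

```shell
# Ship the application jar with --jars so driver and executors load the
# same class definitions ("xclimf-assembly.jar" is a placeholder name).
spark-submit \
  --master yarn \
  --jars xclimf-assembly.jar \
  xclimf_spark.py
```

Adding the jar only via `spark.driver.extraClassPath` (or a local classpath) leaves the executors deserializing against different classes, which produces this kind of `SerializationProxy` cast error.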