timotta / xclimf

Python implementation of xCLiMF: Optimizing Expected Reciprocal Rank for Data with Multiple Levels of Relevance
https://tiago.rio.br

py2java cannot convert list into Seq #1

Open haow85 opened 4 years ago

haow85 commented 4 years ago

I was running the xclimf_spark code and got the following error:

Traceback (most recent call last):
  File "/home/hadoop/CLiMF/xclimf/xclimf_spark.py", line 98, in <module>
    main()
  File "/home/hadoop/CLiMF/xclimf/xclimf_spark.py", line 86, in main
    (U, V) = alg(train, opts)
  File "/home/hadoop/CLiMF/xclimf/xclimf_spark.py", line 35, in alg
    model = x.fit(_py2java(sc, ratings))
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
  File "/usr/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling o202.fit.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 11 in stage 12.0 failed 4 times, most recent failure: Lost task 11.3 in stage 12.0 (TID 1628, ip-10-13-250-104.us-west-2.compute.internal, executor 25): java.lang.ClassCastException: cannot assign instance of scala.collection.immutable.List$SerializationProxy to field org.apache.spark.rdd.RDD.org$apache$spark$rdd$RDD$$dependencies_ of type scala.collection.Seq in instance of org.apache.spark.rdd.MapPartitionsRDD

It looks like py2java failed to convert a Python list into a Scala Seq. How can I fix this?

timotta commented 3 years ago

Sorry, I didn't notice this issue before. Have you found a workaround?

The problem may be the Scala or Spark version. This repo is very old and I'm not using it anymore, but if you want to contribute a fix, I'll accept it.
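
One way to test the version hypothesis is to compare the Scala version the cluster runs with the Scala version the xclimf jar was built for. The following is only a minimal sketch, not part of this repo: the helper names and the jar name "xclimf_2.11-0.1.jar" are hypothetical, and it assumes the jar follows the usual sbt naming convention of a "_<scala binary version>" suffix.

```python
import re


def scala_binary_version(version_string):
    """Return the major.minor Scala version, e.g. '2.11' from 'version 2.11.12'."""
    match = re.search(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("no version number found in %r" % version_string)
    return "%s.%s" % (match.group(1), match.group(2))


def jar_scala_version(jar_name):
    """Return the Scala suffix of a jar name, or None if it has none.

    Assumes the sbt cross-build convention, e.g. 'name_2.11-0.1.jar'.
    """
    match = re.search(r"_(\d+\.\d+)", jar_name)
    return match.group(1) if match else None


def versions_match(cluster_version_string, jar_name):
    """True when the jar's Scala suffix equals the cluster's Scala binary version."""
    return jar_scala_version(jar_name) == scala_binary_version(cluster_version_string)


# Hypothetical values for illustration; on a real cluster the first string
# would come from the driver JVM (see the note after this block).
print(versions_match("version 2.11.12", "xclimf_2.11-0.1.jar"))
print(versions_match("version 2.12.10", "xclimf_2.11-0.1.jar"))
```

On a running cluster, the first argument could be read from the driver with sc._jvm.scala.util.Properties.versionString(), which goes through PySpark's private _jvm gateway, and sc.version gives the Spark version. A mismatch would support the version explanation. A match would not rule out other causes, such as the jar not being on the executors' classpath.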