sryza / spark-timeseries

A library for time series analysis on Apache Spark
Apache License 2.0

Py4JError: javaTimeSeriesRDDFromObservations does not exist in the JVM #208

Open connyK opened 6 years ago

connyK commented 6 years ago

Hi @sryza,

I am trying to run a spark-ts example similar to Stocks.py. As you did there, I would like to call time_series_rdd_from_observations(dtIndex, obs, "timestamp", "timestamp", "price"). Running the code yields:

    Exception in thread "Thread-3" java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame
        at java.lang.Class.getDeclaredMethods0(Native Method)
        at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
        at java.lang.Class.privateGetPublicMethods(Class.java:2902)
        at java.lang.Class.getMethods(Class.java:1615)
        at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:284)
        at py4j.commands.ReflectionCommand.getMember(ReflectionCommand.java:140)
        at py4j.commands.ReflectionCommand.execute(ReflectionCommand.java:91)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Thread.java:748)
    Caused by: java.lang.ClassNotFoundException: org.apache.spark.sql.DataFrame
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 9 more
    ERROR:root:Exception while sending command.
    Traceback (most recent call last):
      File "/home/username/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 883, in send_command
        response = connection.send_command(command)
      File "/home/username/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1040, in send_command
        "Error while receiving", e, proto.ERROR_ON_RECEIVE)
    Py4JNetworkError: Error while receiving
    Traceback (most recent call last):
      File "/home/username/path/to/file/test.py", line 71, in <module>
        time_series_rdd_from_observations(dtIndex, obs, "timestamp", "timestamp", "price")
      File "/home/username/.local/lib/python2.7/site-packages/sparkts/timeseriesrdd.py", line 235, in time_series_rdd_from_observations
        jtsrdd = jvm.com.cloudera.sparkts.api.java.JavaTimeSeriesRDDFactory.javaTimeSeriesRDDFromObservations( \
      File "/home/username/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1363, in __getattr__
    py4j.protocol.Py4JError: com.cloudera.sparkts.api.java.JavaTimeSeriesRDDFactory.javaTimeSeriesRDDFromObservations does not exist in the JVM.

I run the file via:

    spark-submit --jars sparkts-0.3.0-jar-with-dependencies.jar test.py

Any idea?

connyK commented 6 years ago

I built a new sparkts version, but I still get the same error:

    spark-submit --jars spark-timeseries-master/target/sparkts-0.4.0-SNAPSHOT-jar-with-dependencies.jar test.py

    Traceback (most recent call last):
      File "/home/username/path/to/file/test.py", line 71, in <module>
        time_series_rdd_from_observations(dtIndex, obs, "timestamp", "timestamp", "price")
      File "/home/username/.local/lib/python2.7/site-packages/sparkts/timeseriesrdd.py", line 235, in time_series_rdd_from_observations
        jtsrdd = jvm.com.cloudera.sparkts.api.java.JavaTimeSeriesRDDFactory.javaTimeSeriesRDDFromObservations( \
      File "/home/username/spark/python/lib/py4j-0.10.4-src.zip/py4j/java_gateway.py", line 1363, in __getattr__
    py4j.protocol.Py4JError: com.cloudera.sparkts.api.java.JavaTimeSeriesRDDFactory.javaTimeSeriesRDDFromObservations does not exist in the JVM

Thanks

tjyiiuan commented 6 years ago

Same issue here. Do you have any solution yet? Thanks.