Open wctmanager opened 1 year ago
What versions of ray and raydp are you using?
I tried running it with Docker images built from ray:2.4.0 (currently the latest) and ray:2.2.0, both for py38. The raydp version in the image is whatever the current Dockerfile installs, i.e. the latest, which is currently 1.5.0. The same versions were used on the client side. Thank you for your help.
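For comparing the client and the image, a minimal sketch that prints what is actually installed in a given Python environment. It assumes the packages were installed under the distribution names `ray`, `raydp` and `pyspark`; the helper names `installed_version` and `report_versions` are made up for illustration.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string for a distribution, or None."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def report_versions(packages=("ray", "raydp", "pyspark")):
    """Map each distribution name to its installed version (None if absent)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    # Run once on the client and once inside the container, then compare.
    for name, version in report_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running the same script on the client and inside a pod built from the image should make any mismatch visible.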
It looks like raydp v1.5.0 is based on ray 2.1.0 (see core/raydp-main/pom.xml), so I tried building an image with ray:2.1.0. With that image, raydp.init_spark works.
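A rough sketch of a guard that could run before `raydp.init_spark`, based only on the pairing observed above. The `BUILT_AGAINST` table and the function name are hypothetical, and requiring an exact version match is an assumption; raydp may well tolerate a wider range of ray versions than this check allows.

```python
# Pairing observed in this thread only (raydp 1.5.0 -> ray 2.1.0, per
# core/raydp-main/pom.xml). This is not an official compatibility matrix.
BUILT_AGAINST = {"1.5.0": "2.1.0"}


def ray_matches_raydp(raydp_version, ray_version, built_against=BUILT_AGAINST):
    """Return True/False when the pairing is known, None when it is not."""
    expected = built_against.get(raydp_version)
    if expected is None:
        return None  # no information about this raydp version
    return ray_version == expected


if __name__ == "__main__":
    # The three ray versions tried in this thread.
    for ray_version in ("2.1.0", "2.2.0", "2.4.0"):
        print(ray_version, ray_matches_raydp("1.5.0", ray_version))
```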
I created a Docker image as described at https://github.com/oap-project/raydp/tree/master/docker; the only change is that it is based on rayproject/ray:latest-py38 (py38 rather than the default py37). The image was deployed with the Helm charts described at https://docs.ray.io/en/latest/cluster/kubernetes/getting-started.html#kuberay-quickstart. I use Azure Kubernetes Service (AKS) and access my k8s cluster there remotely.
Then

    import ray
    import raydp
    ray.init("ray://x.x.x.x:10001")

goes fine and connects to the ray cluster, but

    spark = raydp.init_spark(app_name='RayDP Example', num_executors=1, executor_cores=1, executor_memory='1G')

fails with

    Traceback (most recent call last):
      File "python/ray/_raylet.pyx", line 870, in ray._raylet.execute_task
      File "python/ray/_raylet.pyx", line 921, in ray._raylet.execute_task
      File "python/ray/_raylet.pyx", line 877, in ray._raylet.execute_task
      File "python/ray/_raylet.pyx", line 881, in ray._raylet.execute_task
      File "python/ray/_raylet.pyx", line 821, in ray._raylet.execute_task.function_executor
      File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/_private/function_manager.py", line 670, in actor_method_executor
        return method(__ray_actor, *args, **kwargs)
      File "/home/ray/anaconda3/lib/python3.8/site-packages/ray/util/tracing/tracing_helper.py", line 460, in _resume_span
        return method(self, *_args, **_kwargs)
      File "/opt/conda/lib/python3.8/site-packages/ray/util/tracing/tracing_helper.py", line 460, in _resume_span
      File "/opt/conda/lib/python3.8/site-packages/raydp/spark/ray_cluster_master.py", line 56, in start_up
      File "/home/ray/anaconda3/lib/python3.8/site-packages/py4j/java_gateway.py", line 1321, in __call__
        return_value = get_return_value(
      File "/home/ray/anaconda3/lib/python3.8/site-packages/py4j/protocol.py", line 326, in get_return_value
        raise Py4JJavaError(
    py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.deploy.raydp.RayAppMaster.setProperties.
    : java.lang.NullPointerException
        at java.util.Hashtable.put(Hashtable.java:460)
        at java.util.Properties.setProperty(Properties.java:166)
        at java.lang.System.setProperty(System.java:812)
        at org.apache.spark.deploy.raydp.RayAppMaster$.$anonfun$setProperties$1(RayAppMaster.scala:336)
        at scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:400)
        at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:728)
        at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:728)
        at scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:728)
        at org.apache.spark.deploy.raydp.RayAppMaster$.setProperties(RayAppMaster.scala:335)
        at org.apache.spark.deploy.raydp.RayAppMaster.setProperties(RayAppMaster.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:282)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Thread.java:750)
Any ideas? Thank you very much.