Open rodmaz opened 5 years ago
Hmm, my first thought is that this is a problem on Livy's end. Could you possibly try to create an empty session directly through the Livy REST API's sessions endpoint, using curl or some such? If the same thing happens there, I think you'll have to file the bug against them.
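For reference, a minimal sketch of that check (the host and port here are the Livy default, `localhost:8998` — adjust for your cluster; the session id `0` assumes this is the first session created):

```shell
LIVY_URL="http://localhost:8998"   # assumed default; change for your setup

# Ask Livy to create an empty PySpark session
curl -s -X POST "$LIVY_URL/sessions" \
     -H "Content-Type: application/json" \
     -d '{"kind": "pyspark"}'

# Poll the session until its "state" reaches "idle" (or an error state)
curl -s "$LIVY_URL/sessions/0"
```

If the session never leaves `starting` here either, the problem is on Livy's side rather than sparkmagic's.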
Can you post your container logs? Those are usually pretty instructive as to what's going on when things don't go as expected. Another thing to try: can you start a pyspark shell or spark-shell locally on the instance where your Livy server is running?
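Those two checks might look something like this (the container name `jupyterhub` is a guess — substitute whatever `docker ps` shows on your box):

```shell
# Tail the JupyterHub container logs (container name is hypothetical)
docker logs --tail 100 jupyterhub

# On the Livy host, confirm Spark itself can start against YARN
spark-shell --master yarn
# or, for Python:
pyspark --master yarn
```

If the local shell also hangs while acquiring a YARN application, the issue is with the cluster rather than with sparkmagic or JupyterHub.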
Perhaps it took more than 60 seconds to create the Spark session / YARN application in your cluster? sparkmagic waits for just 60 seconds by default, which may not be enough on very busy clusters where YARN has to wait for resource preemption to reclaim resources from other queues.
On the sparkmagic side:

```python
import sparkmagic.utils.configuration as livy_conf

# Raise the session-startup timeout from the 60-second default to 5 minutes
livy_conf.override(livy_conf.livy_session_startup_timeout_seconds.__name__, 300)
```

On the Livy side, add to livy.conf:

```
livy.server.yarn.app-lookup-timeout = 300s
```
@rodmaz -- were you able to find a solution to this?
This is a weird issue.
We are running an AWS EMR (5.20.0) cluster with Hadoop, Spark, Livy, and JupyterHub. The cluster is working fine and so is Livy (we can submit/query jobs without authentication).
However whenever we start a notebook using kernel PySpark3, the following error occurs:
However Livy does start the job on the cluster as the Livy session log shows:
Application is also running on Hadoop YARN/Spark as we can see:
Also looking inside the Docker container running JupyterHub we see no errors in the log:
Any ideas why JupyterHub and Sparkmagic can't detect the successfully created Spark session? This problem makes it impossible to run Jupyter notebooks on our cluster. Thanks.