jupyter-incubator / sparkmagic

Jupyter magics and kernels for working with remote Spark clusters

Code not running #384

Closed: yugam1 closed this issue 7 years ago

yugam1 commented 7 years ago

The error appears any time I run any code in any of the kernels:

[screenshot]

aggFTW commented 7 years ago

It seems like your configuration for a Livy endpoint is not valid. What does your config.json file look like?

yugam1 commented 7 years ago

It is like this:

{
  "kernel_python_credentials" : {
    "username": "",
    "password": "",
    "url": "http://localhost:8998",
    "auth": "None"
  },

  "kernel_scala_credentials" : {
    "username": "",
    "password": "",
    "url": "http://localhost:8998",
    "auth": "None"
  },
  "kernel_r_credentials": {
    "username": "",
    "password": "",
    "url": "http://localhost:8998"
  },

  "logging_config": {
    "version": 1,
    "formatters": {
      "magicsFormatter": {
        "format": "%(asctime)s\t%(levelname)s\t%(message)s",
        "datefmt": ""
      }
    },
    "handlers": {
      "magicsHandler": {
        "class": "hdijupyterutils.filehandler.MagicsFileHandler",
        "formatter": "magicsFormatter",
        "home_path": "~/.sparkmagic"
      }
    },
    "loggers": {
      "magicsLogger": {
        "handlers": ["magicsHandler"],
        "level": "DEBUG",
        "propagate": 0
      }
    }
  },

  "wait_for_idle_timeout_seconds": 15,
  "livy_session_startup_timeout_seconds": 60,

  "fatal_error_suggestion": "The code failed because of a fatal error:\n\t{}.\n\nSome things to try:\na) Make sure Spark has enough available resources for Jupyter to create a Spark context.\nb) Contact your Jupyter administrator to make sure the Spark magics library is configured correctly.\nc) Restart the kernel.",

  "ignore_ssl_errors": false,

  "session_configs": {
    "driverMemory": "1000M",
    "executorCores": 2
  },

  "use_auto_viz": true,
  "coerce_dataframe": true,
  "max_results_sql": 2500,
  "pyspark_dataframe_encoding": "utf-8",

  "heartbeat_refresh_seconds": 30,
  "livy_server_heartbeat_timeout_seconds": 0,
  "heartbeat_retry_seconds": 10,

  "server_extension_default_kernel_name": "pysparkkernel",
  "custom_headers": {},

  "retry_policy": "configurable",
  "retry_seconds_to_sleep_list": [0.2, 0.5, 1, 3, 5],
  "configurable_retry_policy_max_retries": 8
}

aggFTW commented 7 years ago

And you have Livy running on http://localhost:8998/? Can you get a sessions response without any authentication if you go to http://localhost:8998/sessions?
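
For reference, the same check can be run from a terminal; curl being available is an assumption:

curl http://localhost:8998/sessions

A reachable Livy server with no sessions replies with {"from":0,"total":0,"sessions":[]}.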

jmeidam commented 7 years ago

I am also curious to find the answer to this issue. I am experiencing the same error and I have the same configuration file.

I can go to http://localhost:8998/sessions without authentication. The browser shows a JSON response with the following content: {"from":0,"total":0,"sessions":[]}

So there are no sessions, which might be bad. I should add that I did not change anything in the Docker files or config files; everything is as it was in the GitHub repository (as of yesterday). Furthermore, when running docker-compose up, I see the following output, which may contain some clues:


Creating sparkmagicclone_spark_1 ...
Creating sparkmagicclone_spark_1 ... done
Creating sparkmagicclone_jupyter_1 ...
Creating sparkmagicclone_jupyter_1 ... done
Attaching to sparkmagicclone_spark_1, sparkmagicclone_jupyter_1
jupyter_1  | [W 08:10:55.921 NotebookApp] Unrecognized JSON config file version, assuming version 1
jupyter_1  | [I 08:10:55.935 NotebookApp] Writing notebook server cookie secret to /home/jovyan/.local/share/jupyter/runtime/notebook_cookie_secret
jupyter_1  | [W 08:10:55.972 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using encryption. This is not recommended.
jupyter_1  | [W 08:10:55.973 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using authentication. This is highly insecure and not recommended.
jupyter_1  | [I 08:10:56.340 NotebookApp] sparkmagic extension enabled!
jupyter_1  | [I 08:10:56.343 NotebookApp] Serving notebooks from local directory: /home/jovyan/work
jupyter_1  | [I 08:10:56.343 NotebookApp] 0 active kernels
jupyter_1  | [I 08:10:56.343 NotebookApp] The Jupyter Notebook is running at: http://[all ip addresses on your system]:8888/
jupyter_1  | [I 08:10:56.343 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
spark_1    | 17/07/18 08:10:57 INFO StateStore$: Using BlackholeStateStore for recovery.
spark_1    | 17/07/18 08:10:57 INFO BatchSessionManager: Recovered 0 batch sessions. Next session id: 0
spark_1    | 17/07/18 08:10:57 INFO InteractiveSessionManager: Recovered 0 interactive sessions. Next session id: 0
spark_1    | 17/07/18 08:10:57 INFO InteractiveSessionManager: Heartbeat watchdog thread started.
spark_1    | 17/07/18 08:10:57 INFO WebServer: Starting server on http://spark:8998

aggFTW commented 7 years ago

Hmm, that's odd: the Docker images should work. Those logs and that payload look fine. Having no sessions is expected, as sessions are created by the kernel.
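
To take the kernel out of the picture entirely, you can exercise the Livy REST API directly; the endpoint below is taken from the config posted above:

curl -X POST -H "Content-Type: application/json" -d '{"kind": "pyspark"}' http://localhost:8998/sessions

A working server responds with a session description (an id and a state such as "starting"); a test session created this way can be removed again with curl -X DELETE http://localhost:8998/sessions/0.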

Building and running the Docker images from master, I get this:

[screenshot]

Can you run %%info on the kernels and post the result?

jmeidam commented 7 years ago

When running that in either of the pyspark kernels I get:

Current session configs: {'kind': 'pyspark3'}
An internal error was encountered.
Please file an issue at https://github.com/jupyter-incubator/sparkmagic
Error:
'>=' not supported between instances of 'NoneType' and 'int'
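
For context, that TypeError is generic Python 3 behavior rather than anything Livy-specific: comparing None to an int was allowed on Python 2 but fails on Python 3. A minimal reproduction from a shell, independent of sparkmagic:

python3 -c "timeout = None; timeout >= 0"
# TypeError: '>=' not supported between instances of 'NoneType' and 'int'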

There is one thing that I did differently from the installation procedure described in the readme. Instead of first doing docker-compose build, I just did docker-compose up right away. It did seem to install everything, but could it be that some part of the setup is missing because I skipped that step?
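
Worth noting: docker-compose up only builds an image if it does not exist yet; it does not rebuild after the source changes. To force a rebuild in one step you can run:

docker-compose up --build

or run docker-compose build explicitly before docker-compose up.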

aggFTW commented 7 years ago

Please make sure you have the latest code and that you follow the instructions. It seems like you are hitting an exception that was fixed in the latest code. What do you get if you run:

%%bash
pip show sparkmagic

Do you get version 0.12.2?

What happens if you execute:

%%bash
cat ~/.sparkmagic/config.json

jmeidam commented 7 years ago

Oh, I seem to have gotten an older version:

Name: sparkmagic
Version: 0.11.2
Summary: SparkMagic: Spark execution via Livy
Home-page: https://github.com/jupyter-incubator/sparkmagic/sparkmagic
Author: Jupyter Development Team
Author-email: jupyter@googlegroups.org
License: BSD 3-clause
Location: /Users/jmeidam/anaconda3/lib/python3.6/site-packages
Requires: hdijupyterutils, autovizwidget, mock, nose, requests, pandas, ipython, ipykernel, ipywidgets, tornado, numpy, notebook

I'm not sure how that happened; I think I cloned the git repo this Monday. I'll try pulling and rebuilding, and I'll let you know what happens.
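
As an aside, the Location line above points at a host path (/Users/...), meaning pip show executed against the host environment rather than inside the container. To query the container directly, something along these lines works; the container name is taken from the docker-compose output earlier in this thread:

docker exec sparkmagicclone_jupyter_1 pip show sparkmagic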

jmeidam commented 7 years ago

Well, odd things were happening. I did a git pull, which said everything was up to date. I also noticed in the output above that pip show was reporting an installation outside of the container, so it may indeed have had something to do with me skipping the docker-compose build step. I removed everything, did a fresh git clone, then docker-compose build and docker-compose up. Now it works.

Thanks very much for your help.

These are outputs of the various commands suggested in this thread:

%%info
Current session configs: {'driverMemory': '1000M', 'kind': 'pyspark3', 'executorCores': 2}
No active sessions.
%%bash
pip show sparkmagic

---
Metadata-Version: 2.0
Name: sparkmagic
Version: 0.12.1
Summary: SparkMagic: Spark execution via Livy
Home-page: https://github.com/jupyter-incubator/sparkmagic/sparkmagic
Author: Jupyter Development Team
Author-email: jupyter@googlegroups.org
Installer: pip
License: BSD 3-clause
Location: /opt/conda/lib/python3.5/site-packages
Requires: notebook, autovizwidget, pandas, hdijupyterutils, requests, ipykernel, ipython, nose, mock, numpy, tornado, requests-kerberos, ipywidgets
Classifiers:
  Development Status :: 4 - Beta
  Environment :: Console
  Intended Audience :: Science/Research
  License :: OSI Approved :: BSD License
  Natural Language :: English
  Programming Language :: Python :: 2.6
  Programming Language :: Python :: 2.7
  Programming Language :: Python :: 3.3
  Programming Language :: Python :: 3.4

You are using pip version 8.1.2, however version 9.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
%%bash
cat ~/.sparkmagic/config.json

{
  "kernel_python_credentials" : {
    "username": "",
    "password": "",
    "url": "http://spark:8998",
    "auth": "None"
  },

  "kernel_scala_credentials" : {
    "username": "",
    "password": "",
    "url": "http://spark:8998",
    "auth": "None"
  },
  "kernel_r_credentials": {
    "username": "",
    "password": "",
    "url": "http://spark:8998"
  },

  "logging_config": {
    "version": 1,
    "formatters": {
      "magicsFormatter": { 
        "format": "%(asctime)s\t%(levelname)s\t%(message)s",
        "datefmt": ""
      }
    },
    "handlers": {
      "magicsHandler": { 
        "class": "hdijupyterutils.filehandler.MagicsFileHandler",
        "formatter": "magicsFormatter",
        "home_path": "~/.sparkmagic"
      }
    },
    "loggers": {
      "magicsLogger": { 
        "handlers": ["magicsHandler"],
        "level": "DEBUG",
        "propagate": 0
      }
    }
  },

  "wait_for_idle_timeout_seconds": 15,
  "livy_session_startup_timeout_seconds": 60,

  "fatal_error_suggestion": "The code failed because of a fatal error:\n\t{}.\n\nSome things to try:\na) Make sure Spark has enough available resources for Jupyter to create a Spark context.\nb) Contact your Jupyter administrator to make sure the Spark magics library is configured correctly.\nc) Restart the kernel.",

  "ignore_ssl_errors": false,

  "session_configs": {
    "driverMemory": "1000M",
    "executorCores": 2
  },

  "use_auto_viz": true,
  "coerce_dataframe": true,
  "max_results_sql": 2500,
  "pyspark_dataframe_encoding": "utf-8",

  "heartbeat_refresh_seconds": 30,
  "livy_server_heartbeat_timeout_seconds": 0,
  "heartbeat_retry_seconds": 10,

  "server_extension_default_kernel_name": "pysparkkernel",
  "custom_headers": {},

  "retry_policy": "configurable",
  "retry_seconds_to_sleep_list": [0.2, 0.5, 1, 3, 5],
  "configurable_retry_policy_max_retries": 8
}

aggFTW commented 7 years ago

Cool. Just so you know, you need to run git pull jupyter-incubator master from your clone if you want to merge what's in this repository into your clone. I don't know what level of expertise you have, but I thought it wouldn't hurt to state it here for others who come across this issue.
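
Spelled out, assuming a remote named jupyter-incubator has not been configured in the clone yet:

git remote add jupyter-incubator https://github.com/jupyter-incubator/sparkmagic.git
git pull jupyter-incubator master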

I'll close this issue for now since I didn't get replies from the original poster.

tylerxiety commented 5 years ago

I'm having a similar issue: I get "Error sending http request and maximum retry encountered" when I run any code on any of the kernels.

I have the latest version, 0.12.7, and copied the sample config.json file. I have Jupyter and Spark installed via Anaconda, so I didn't use Docker.

[screenshot]
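
When this retry error appears, the underlying HTTP failure is usually recorded in sparkmagic's log directory (the home_path from the logging config shown earlier); one way to inspect it, assuming the default location:

tail -n 50 ~/.sparkmagic/logs/*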

ghoshkunal123 commented 5 years ago

I have had the same issue with a Jupyter Notebook on SageMaker; just by restarting the kernel, the problem went away.

surbhijain-zomato commented 4 years ago

> I'm having a similar issue: I get "Error sending http request and maximum retry encountered" when I run any code on any of the kernels.
>
> I have the latest version, 0.12.7, and copied the sample config.json file. I have Jupyter and Spark installed via Anaconda, so I didn't use Docker.

I am facing the same issue. Any solution to this?

wtfzambo commented 4 years ago

Same shit here as @surbhijain-zomato.

It was working a couple of weeks ago; now for some reason it gives me this bug, although nothing changed on my side.

EDIT: Ok, I seem to have fixed it by simply stopping the notebook (not just the kernel, but the whole notebook itself) and restarting it. https://stackoverflow.com/a/59666894/12127578

Data-Jack commented 3 years ago

> I'm having a similar issue: I get "Error sending http request and maximum retry encountered" when I run any code on any of the kernels.
>
> I have the latest version, 0.12.7, and copied the sample config.json file. I have Jupyter and Spark installed via Anaconda, so I didn't use Docker.

@surbhijain-zomato, @tylerxiety

I experienced this same problem from within SageMaker Studio. I did the below and it fixed the problem for me, but has anyone found a solution that does not involve manually editing a config file? It's really annoying to do.

From ~/.sparkmagic/logs I found that it was trying to connect to the address "http://ip-10-0-0-113.ec2.internal:8998/sessions".

2021-07-28 14:00:30,129 ERROR ReliableHttpClient Request to 'http://ip-10-0-0-113.ec2.internal:8998/sessions' failed with 'HTTPConnectionPool(host='ip-10-0-0-113.ec2.internal', port=8998): Max retries exceeded with url: /sessions (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f10145d6350>: Failed to establish a new connection: [Errno -2] Name or service not known'))'

In a terminal session within the image I installed nano and edited /etc/sparkmagic/config.json to contain the URL "http://10.0.0.113:8998" (see below), and now it works for me.

  "kernel_python_credentials": {
    "username": "Livy",
    "password": "",
    "url": "http://10.0.0.113:8998",
    "auth": "None"
  },
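
For anyone who would rather script that edit than install an editor inside the container, a one-off snippet along these lines should work; the path and replacement URL are the ones from the comment above:

python3 - <<'EOF'
import json

path = "/etc/sparkmagic/config.json"
with open(path) as f:
    cfg = json.load(f)

# Point every kernel credentials block at the reachable Livy address.
for key in ("kernel_python_credentials", "kernel_scala_credentials", "kernel_r_credentials"):
    if key in cfg:
        cfg[key]["url"] = "http://10.0.0.113:8998"

with open(path, "w") as f:
    json.dump(cfg, f, indent=2)
EOF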

lucharo commented 2 years ago

Hitting the same issue! My config file is not being read!