jupyter-server / kernel_gateway

Jupyter Kernel Gateway
http://jupyter-kernel-gateway.readthedocs.org/en/latest/
Other
501 stars 131 forks source link

jupyter-kernel-gateway hanging/slow #376

Closed 3goats closed 1 year ago

3goats commented 1 year ago

Description

I've followed this documentation exactly. https://jupyter-kernel-gateway.readthedocs.io/en/latest/getting-started.html#install and have worked through the first example which basically proves the basic functionality.

Reproduce

When I run the curl http://127.0.0.1:10100/hello/world command the server shows that a connection has been received:

[I 230206 16:01:49 web:2271] 200 GET /hello/world (127.0.0.1) 640666.69ms

However the response never gets returned, it just hangs indefinitely. Here are my versions:

Selected Jupyter core packages... IPython : 8.9.0 ipykernel : 6.21.1 ipywidgets : not installed jupyter_client : 8.0.2 jupyter_core : 5.2.0 jupyter_server : 2.2.1 jupyterlab : 3.6.0rc0 nbclient : 0.7.2 nbconvert : 7.2.9 nbformat : 5.7.3 notebook : 6.5.2 qtconsole : not installed traitlets : 5.9.0

jupyter kernelgateway: 2.5.1

I'm running on an M! Mac but I have the same issue on Ubuntu 20.10.

I also tried the basic node client from here: (https://github.com/jupyter/kernel_gateway_demos/blob/master/node_client_example/src/client.js)

This sort of works, but it's extremely slow - a simple `print("Hello World")' takes around 30-40 seconds to resopond.

welcome[bot] commented 1 year ago

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! :hugs:
If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template as it helps other other community members to contribute more effectively. welcome You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! :wave:
Welcome to the Jupyter community! :tada:

3goats commented 1 year ago

Sorry about the early post - accidentally hit enter too early

kevin-bates commented 1 year ago

Hi @carlskii - this repo is "lightly maintained" so bear with us. Would you be able to restart your JKG server with --debug and provide the entire console output from startup through the completion of the first GET /hello/world request? Perhaps that will provide some clues.

3goats commented 1 year ago

Sure here you go:

KernelGatewayApp] Connecting to: tcp://127.0.0.1:61680
[KernelGatewayApp] Registering resource: /hello/world, methods: (['GET'])
[KernelGatewayApp] Registering resource: /_api/spec/swagger.json, methods: (GET)
[KernelGatewayApp] Jupyter Kernel Gateway at http://127.0.0.1:10100
[D 230206 16:23:20 selector_events:54] Using selector: KqueueSelector
[D 230206 16:23:43 handlers:193] Request code for notebook cell is: REQUEST = "{\"body\": \"\", \"args\": {}, \"path\": {}, \"headers\": {\"Host\": \"127.0.0.1:10100\", \"User-Agent\": \"curl/7.85.0\", \"Accept\": \"*/*\"}}"
[KernelGatewayApp] connecting shell channel to tcp://127.0.0.1:61679
[KernelGatewayApp] Connecting to: tcp://127.0.0.1:61679
[KernelGatewayApp] activity on 73346505-7e51-429e-913a-690c1000429a: status (busy)
[KernelGatewayApp] activity on 73346505-7e51-429e-913a-690c1000429a: execute_input
[KernelGatewayApp] activity on 73346505-7e51-429e-913a-690c1000429a: status (idle)

Seems to back to idle but doesn't return anything back to the client, so the client just sits there waiting forever..

kevin-bates commented 1 year ago

Sigh. The logging is not using a helpful format string that would contain timestamps. Also, this output is not including the startup of JKG - where the kernel is actually started IIUC - although, without timestamps, this is essentially useless. That said, the fact there are 13 seconds(!) between trace statements for those that DO provide timestamps is alarming.

3goats commented 1 year ago

Here's the entire startup output from JKG:

jupyter kernelgateway --KernelGatewayApp.api=kernel_gateway.notebook_http --KernelGatewayApp.seed_uri=./my_example.ipynb host=127.0.0.1 --port=10100 --debug 
[KernelGatewayApp] Searching ['/Users/cbourne/.local/share/virtualenvs/Jupyter-q0MWmKx8/etc/jupyter', '/Users/cbourne/.jupyter', '/usr/local/etc/jupyter', '/etc/jupyter'] for config files
[KernelGatewayApp] Looking for jupyter_config in /etc/jupyter
[KernelGatewayApp] Looking for jupyter_config in /usr/local/etc/jupyter
[KernelGatewayApp] Looking for jupyter_config in /Users/cbourne/.jupyter
[KernelGatewayApp] Looking for jupyter_config in /Users/cbourne/.local/share/virtualenvs/Jupyter-q0MWmKx8/etc/jupyter
[KernelGatewayApp] Looking for jupyter_kernel_gateway_config in /etc/jupyter
[KernelGatewayApp] Looking for jupyter_kernel_gateway_config in /usr/local/etc/jupyter
[KernelGatewayApp] Looking for jupyter_kernel_gateway_config in /Users/cbourne/.jupyter
[KernelGatewayApp] Looking for jupyter_kernel_gateway_config in /Users/cbourne/.local/share/virtualenvs/Jupyter-q0MWmKx8/etc/jupyter
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
[KernelGatewayApp] Instantiating kernel 'Python 3 (ipykernel)' with kernel provisioner: local-provisioner
[KernelGatewayApp] Starting kernel: ['/Users/cbourne/.local/share/virtualenvs/Jupyter-q0MWmKx8/bin/python', '-m', 'ipykernel_launcher', '-f', '/Users/cbourne/Library/Jupyter/runtime/kernel-2db92d97-10a7-4779-987f-591711bf39eb.json']
[KernelGatewayApp] Connecting to: tcp://127.0.0.1:62683
[KernelGatewayApp] Connecting to: tcp://127.0.0.1:62680
[KernelGatewayApp] Kernel started: 2db92d97-10a7-4779-987f-591711bf39eb, name: python3
[KernelGatewayApp] Kernel args: {'kernel_name': 'python3'}
[KernelGatewayApp] connecting iopub channel to tcp://127.0.0.1:62680
[KernelGatewayApp] Connecting to: tcp://127.0.0.1:62680
[KernelGatewayApp] connecting shell channel to tcp://127.0.0.1:62679
[KernelGatewayApp] Connecting to: tcp://127.0.0.1:62679
[KernelGatewayApp] connecting stdin channel to tcp://127.0.0.1:62681
[KernelGatewayApp] Connecting to: tcp://127.0.0.1:62681
[KernelGatewayApp] connecting heartbeat channel to tcp://127.0.0.1:62682
[KernelGatewayApp] connecting control channel to tcp://127.0.0.1:62683
[KernelGatewayApp] Connecting to: tcp://127.0.0.1:62683
0.00s - Debugger warning: It seems that frozen modules are being used, which may
0.00s - make the debugger miss breakpoints. Please pass -Xfrozen_modules=off
0.00s - to python to disable frozen modules.
0.00s - Note: Debugging will proceed. Set PYDEVD_DISABLE_FILE_VALIDATION=1 to disable this validation.
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: status (starting)
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: status (busy)
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: status (idle)
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: status (busy)
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: status (idle)
[KernelGatewayApp] Destroying zmq context for <jupyter_client.blocking.client.BlockingKernelClient object at 0x110c6a310>
[KernelGatewayApp] Connecting to: tcp://127.0.0.1:62680
[KernelGatewayApp] Registering resource: /hello/world, methods: (['GET'])
[KernelGatewayApp] Registering resource: /_api/spec/swagger.json, methods: (GET)
[KernelGatewayApp] Jupyter Kernel Gateway at http://127.0.0.1:10100
[D 230206 17:09:54 selector_events:54] Using selector: KqueueSelector
[D 230206 17:10:10 handlers:193] Request code for notebook cell is: REQUEST = "{\"body\": \"\", \"args\": {}, \"path\": {}, \"headers\": {\"Host\": \"127.0.0.1:10100\", \"User-Agent\": \"curl/7.85.0\", \"Accept\": \"*/*\"}}"
[KernelGatewayApp] connecting shell channel to tcp://127.0.0.1:62679
[KernelGatewayApp] Connecting to: tcp://127.0.0.1:62679
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: status (busy)
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: execute_input
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: status (idle)
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: status (busy)
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: execute_input
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: stream
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: status (idle)
[I 230206 17:12:57 web:2271] 200 GET /hello/world (127.0.0.1) 167454.21ms
[D 230206 17:12:57 handlers:193] Request code for notebook cell is: REQUEST = "{\"body\": \"\", \"args\": {}, \"path\": {}, \"headers\": {\"Host\": \"127.0.0.1:10100\", \"User-Agent\": \"curl/7.85.0\", \"Accept\": \"*/*\"}}"
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: status (busy)
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: execute_input
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: status (idle)
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: status (busy)
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: execute_input
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: stream
[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: status (idle)
3goats commented 1 year ago

It seems that the stream is not being sent back to the client. If I terminate the client, that corresponds with this:

[KernelGatewayApp] activity on 2db92d97-10a7-4779-987f-591711bf39eb: stream
kevin-bates commented 1 year ago

Thanks for the update.

It seems that the stream is not being sent back to the client.

That's interesting. I'll try looking into this a bit but can't make any guarantees of a timely response.

3goats commented 1 year ago

Thanks - bit of a bummer really. Was hoping to use this for a project and keep everything In a Jupyter context.

kevin-bates commented 1 year ago

FWIW - adding this option to the command line will produce better log output:

--KernelGatewayApp.log_format="[%(levelname)1.1s %(asctime)s.%(msecs).03d %(name)s] %(message)s"

I can reproduce this as well. I'm a little suspicious that using the older notebook (non-async) kernel manager with more recent (async-oriented) packages might be problematic here. JKG should really be updated to use classes from jupyter_server rather than notebook.

In fact, if I downgrade jupyter_client to 6.1.12 (e.g., pip uninstall -y jupyter_client; pip install "jupyter_client==6.1.12") I get the following (with the better logging):

[D 230206 09:47:47 handlers:193] Request code for notebook cell is: REQUEST = "{\"body\": \"\", \"args\": {}, \"path\": {}, \"headers\": {\"Host\": \"127.0.0.1:10100\", \"User-Agent\": \"curl/7.85.0\", \"Accept\": \"*/*\"}}"
[D 2023-02-06 09:47:47.417 KernelGatewayApp] connecting shell channel to tcp://127.0.0.1:56529
[D 2023-02-06 09:47:47.417 KernelGatewayApp] Connecting to: tcp://127.0.0.1:56529
[D 2023-02-06 09:47:47.436 KernelGatewayApp] activity on 34d85969-8e3e-4297-808d-f54bd7392950: status (busy)
[D 2023-02-06 09:47:47.437 KernelGatewayApp] activity on 34d85969-8e3e-4297-808d-f54bd7392950: execute_input
[D 2023-02-06 09:47:47.441 KernelGatewayApp] activity on 34d85969-8e3e-4297-808d-f54bd7392950: status (idle)
[D 2023-02-06 09:47:47.445 KernelGatewayApp] activity on 34d85969-8e3e-4297-808d-f54bd7392950: status (busy)
[D 2023-02-06 09:47:47.446 KernelGatewayApp] activity on 34d85969-8e3e-4297-808d-f54bd7392950: execute_input
[D 2023-02-06 09:47:53.880 KernelGatewayApp] activity on 34d85969-8e3e-4297-808d-f54bd7392950: stream
[I 230206 09:47:53 web:2271] 200 GET /hello/world (127.0.0.1) 6466.12ms
[D 2023-02-06 09:47:53.883 KernelGatewayApp] activity on 34d85969-8e3e-4297-808d-f54bd7392950: status (idle)

and the client shows...

$ curl http://127.0.0.1:10100/hello/world
{"data": [[-0.7755802543971934, -1.3531463685266123, -2.0013574397739187, 1.1691946363212657], [-0.32001123702826173, -1.1341102786710193, -1.7607704433669884, 1.9435639143248997], [-0.7528384930476039, 1.5834637636811166, -0.2065888198514187, -1.5650623523825595], [1.0773472994609048, -0.4779716903968818, -0.8962265215501868, -1.0113640255538574], [0.36807503816462533, -1.7393131663759562, 1.1916932430746439, 1.6062664627895835]], "request": {"body": "", "args": {}, "path": {}, "headers": {"Host": "127.0.0.1:10100", "User-Agent": "curl/7.85.0", "Accept": "*/*"}}}

If I run the same curl command a second time, the 6.46 seconds goes down to 15 ms...

[D 230206 09:52:39 handlers:193] Request code for notebook cell is: REQUEST = "{\"body\": \"\", \"args\": {}, \"path\": {}, \"headers\": {\"Host\": \"127.0.0.1:10100\", \"User-Agent\": \"curl/7.85.0\", \"Accept\": \"*/*\"}}"
[D 2023-02-06 09:52:39.317 KernelGatewayApp] activity on 34d85969-8e3e-4297-808d-f54bd7392950: status (busy)
[D 2023-02-06 09:52:39.317 KernelGatewayApp] activity on 34d85969-8e3e-4297-808d-f54bd7392950: execute_input
[D 2023-02-06 09:52:39.319 KernelGatewayApp] activity on 34d85969-8e3e-4297-808d-f54bd7392950: status (idle)
[D 2023-02-06 09:52:39.323 KernelGatewayApp] activity on 34d85969-8e3e-4297-808d-f54bd7392950: status (busy)
[D 2023-02-06 09:52:39.324 KernelGatewayApp] activity on 34d85969-8e3e-4297-808d-f54bd7392950: execute_input
[D 2023-02-06 09:52:39.325 KernelGatewayApp] activity on 34d85969-8e3e-4297-808d-f54bd7392950: stream
[D 2023-02-06 09:52:39.326 KernelGatewayApp] activity on 34d85969-8e3e-4297-808d-f54bd7392950: status (idle)
[I 230206 09:52:39 web:2271] 200 GET /hello/world (127.0.0.1) 15.22ms

So http-mode has some startup cost.

kevin-bates commented 1 year ago

Hi @carlskii - in looking at bringing this package up to date, I found that using jupyter_client >= 8 is the issue because some methods have been removed from the KernelClient class or are now async, etc.

I uninstall jupyter_client again, then installed jupyter_client<8. Once that was done the INITIAL curl response took place in 200.43ms, with a subsequent request in 13.55ms. (It's interesting that the startup cost when using juptyer_client 6.x is so much greater but not worth our time at this point.)

[D 230206 10:03:35 handlers:193] Request code for notebook cell is: REQUEST = "{\"body\": \"\", \"args\": {}, \"path\": {}, \"headers\": {\"Host\": \"127.0.0.1:10100\", \"User-Agent\": \"curl/7.85.0\", \"Accept\": \"*/*\"}}"
[D 2023-02-06 10:03:35.882 KernelGatewayApp] connecting shell channel to tcp://127.0.0.1:56744
[D 2023-02-06 10:03:35.882 KernelGatewayApp] Connecting to: tcp://127.0.0.1:56744
[D 2023-02-06 10:03:35.900 KernelGatewayApp] activity on dafa6552-260d-4501-a764-56b67fabe642: status (busy)
[D 2023-02-06 10:03:35.901 KernelGatewayApp] activity on dafa6552-260d-4501-a764-56b67fabe642: execute_input
[D 2023-02-06 10:03:35.903 KernelGatewayApp] activity on dafa6552-260d-4501-a764-56b67fabe642: status (idle)
[D 2023-02-06 10:03:35.908 KernelGatewayApp] activity on dafa6552-260d-4501-a764-56b67fabe642: status (busy)
[D 2023-02-06 10:03:35.909 KernelGatewayApp] activity on dafa6552-260d-4501-a764-56b67fabe642: execute_input
[D 2023-02-06 10:03:36.079 KernelGatewayApp] activity on dafa6552-260d-4501-a764-56b67fabe642: stream
[D 2023-02-06 10:03:36.081 KernelGatewayApp] activity on dafa6552-260d-4501-a764-56b67fabe642: status (idle)
[I 230206 10:03:36 web:2271] 200 GET /hello/world (127.0.0.1) 200.43ms
[D 230206 10:03:42 handlers:193] Request code for notebook cell is: REQUEST = "{\"body\": \"\", \"args\": {}, \"path\": {}, \"headers\": {\"Host\": \"127.0.0.1:10100\", \"User-Agent\": \"curl/7.85.0\", \"Accept\": \"*/*\"}}"
[D 2023-02-06 10:03:42.491 KernelGatewayApp] activity on dafa6552-260d-4501-a764-56b67fabe642: status (busy)
[D 2023-02-06 10:03:42.491 KernelGatewayApp] activity on dafa6552-260d-4501-a764-56b67fabe642: execute_input
[D 2023-02-06 10:03:42.493 KernelGatewayApp] activity on dafa6552-260d-4501-a764-56b67fabe642: status (idle)
[D 2023-02-06 10:03:42.497 KernelGatewayApp] activity on dafa6552-260d-4501-a764-56b67fabe642: status (busy)
[D 2023-02-06 10:03:42.497 KernelGatewayApp] activity on dafa6552-260d-4501-a764-56b67fabe642: execute_input
[D 2023-02-06 10:03:42.498 KernelGatewayApp] activity on dafa6552-260d-4501-a764-56b67fabe642: stream
[D 2023-02-06 10:03:42.500 KernelGatewayApp] activity on dafa6552-260d-4501-a764-56b67fabe642: status (idle)
[I 230206 10:03:42 web:2271] 200 GET /hello/world (127.0.0.1) 13.55ms

So, for now, please downgrade your jupyter_client to jupyter_client<8, and I think you should be okay.

This package is woefully out of date and making it "current" will take some effort - although nothing unlike we've had to do in other parts of the Jupyter ecosystem - so there should be precedent for anything we need to do.

3goats commented 1 year ago

Thanks so much for your help. Wondering if I could just start a Jupyter Lab server and reverse engineer the calls that are used in that rather than using JKG? Or maybe the downgrade and wait for the updated JKG is the better way to go.

kevin-bates commented 1 year ago

Not sure exactly what you want to do, but I think the http-mode provided by JKG is rather unique, but it's up to you. At least the downgrade of jupyter_client approach provides a workaround. Since you should probably run the gateway projects (JKG and JEG) in a different python env, I suspect that using a slightly older jupyter_client there would be fairly stable.

3goats commented 1 year ago

Yeah I'll stick with JKG for now - I tried jupyter_client v7.4.9 and that indeed is working.

3goats commented 1 year ago

Seems to be working pretty well with the older client. Silly quick question:

The web socket demo (nodeJS) is returning my data like this:

metadata: {},
  content: {
    name: 'stdout',
    text: '\x1B[1;39m{\r\n' +
      '  \x1B[0m\x1B[34;1m"id"\x1B[0m\x1B[1;39m: \x1B[0m\x1B[0;32m"0ee772d0-958d-11ed-9148-f1a6c7cdfb41"\x1B[0m\x1B[1;39m,\r\n' +
      '  \x1B[0m\x1B[34;1m"certificateAuthority"\x1B[0m\x1B[1;39m: \x1B[0m\x1B[0;32m"BUILTIN"\x1B[0m\x1B[1;39m,\r\n' +
      '  \x1B[0m\x1B[34;1m"certificateAuthorityAccountId"\x1B[0m\x1B[1;39m: \x1B[0m\x1B[0;32m"0ee724b0-958d-11ed-9148-f1a6c7cdfb41"\x1B[0m\x1B[1;39m\r\n' +
      '\x1B[1;39m}\x1B[0m\r\n'
  },
  buffers: [],

Do you know what encoding is being used - can't seem to decode it. Seems that the python demo deals with the decoding. i.e. looks like this:

Received message type: stream
  Content: {
  "id": "0ee772d0-958d-11ed-9148-f1a6c7cdfb41",
  "certificateAuthority": "BUILTIN",
  "certificateAuthorityAccountId": "0ee724b0-958d-11ed-9148-f1a6c7cdfb41"
}
kevin-bates commented 1 year ago

@carlskii - I've created JKG 2.5.2 that includes the cap to jupyter_client and the log formatting changes.

As a result, I'm going to close this issue as fixed by #378 as 2.5.2 should be minimally viable for now.

We'll try to circle back to bring this package up to date at a later time when we address #377.

Sorry for the inconvenience.

3goats commented 1 year ago

Excellent thanks.

kevin-bates commented 1 year ago

Hi @carlskii - I just noticed your previous question when submitting my closing comment.

I'm sorry, I'm purely a back-end dev and don't know how to answer your question.