jupyter / jupyter_client

Jupyter protocol client APIs
https://jupyter-client.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Errors with jupyter-client 8.X and 7.X on raw Ubuntu 22.04 container with Python 3.10.1 and 3.9.10 #927

Closed AntoineMorcos closed 1 year ago

AntoineMorcos commented 1 year ago

I have been trying out the newest versions of jupyter-console and jupyter-client in order to find the latest combination that doesn't give errors. The latest sane combination of versions I found is jupyter-console==6.4.2 with jupyter-client==6.1.12. I tested this by following these steps (links point to a gist showing my environment, commands and errors).

I repeated the same experiments in a 3.9.10 pyenv and had the same results.

kevin-bates commented 1 year ago

Hi @AntoineMorcos. FWIW, I can reproduce the issue related to jupyter-client 8, but cannot reproduce the issue with jupyter_client <= 7 (aside from jupyter-client==6.1.13 - which was yanked so that issue is expected). Using jupyter-console==6.4.3 and jupyter_client==7.4.9, I'm able to hold the "Return" button down for quite some time and then get appropriate output with something like import os; os.environ:

(there are dozens of previous lines of empty prompts)
In [1]: 

In [1]: 

In [1]: 

In [1]: 

In [1]: import os; os.environ
Out[1]: 
environ{'HOSTNAME': '1858c08e59a2',
        'PWD': '/',
        'HOME': '/root',
        'TERM': 'xterm-color',
        'SHLVL': '1',
        'PATH': '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin',
        '_': '/usr/local/bin/jupyter',
        'LC_CTYPE': 'C.UTF-8',
        'PYDEVD_USE_FRAME_EVAL': 'NO',
        'JPY_PARENT_PID': '3927',
        'CLICOLOR': '1',
        'FORCE_COLOR': '1',
        'CLICOLOR_FORCE': '1',
        'PAGER': 'cat',
        'GIT_PAGER': 'cat',
        'MPLBACKEND': 'module://matplotlib_inline.backend_inline'}

In [2]: 
root@1858c08e59a2:/# pip freeze
asttokens==2.2.1
backcall==0.2.0
comm==0.1.2
debugpy==1.6.6
decorator==5.1.1
entrypoints==0.4
executing==1.2.0
ipykernel==6.21.1
ipython==8.9.0
jedi==0.18.2
jupyter-console==6.4.3
jupyter_client==7.4.9
jupyter_core==5.2.0
matplotlib-inline==0.1.6
nest-asyncio==1.5.6
packaging==23.0
parso==0.8.3
pexpect==4.8.0
pickleshare==0.7.5
platformdirs==2.6.2
prompt-toolkit==3.0.36
psutil==5.9.4
ptyprocess==0.7.0
pure-eval==0.2.2
Pygments==2.14.0
python-dateutil==2.8.2
pyzmq==25.0.0
six==1.16.0
stack-data==0.6.2
tornado==6.2
traitlets==5.9.0
wcwidth==0.2.6

AntoineMorcos commented 1 year ago

Hi @kevin-bates, thanks for your response! Would you mind sharing your Python version and OS release, please? I am hoping that running my experiments on a fresh Ubuntu container will help limit the possibilities. Do you have any idea whether some part of my setup may be the problem?

kevin-bates commented 1 year ago

I'm using the ubuntu:latest container. Here's the information:

root@2197df282db9:/# python3 --version
Python 3.10.6
root@2197df282db9:/# uname -a
Linux 2197df282db9 5.15.78-0-virt #1-Alpine SMP Fri, 11 Nov 2022 10:19:45 +0000 x86_64 x86_64 x86_64 GNU/Linux
root@2197df282db9:/# cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.1 LTS"

To prevent having to reinstall python for additional test attempts, I committed that image after python's installation as kbates/ubuntu:py-3.10.6 and pushed it to docker hub. You should be able to pull that image, run it with bash, then pip install "jupyter_client<8" jupyter_console and see a working jupyter console.

AntoineMorcos commented 1 year ago

@kevin-bates Thanks for that. I started a container with this image. At first it seemed like I couldn't reproduce the error with it, but then I was able to reproduce it 3 or 4 times within a few minutes.

AntoineMorcos commented 1 year ago

Screen recording to show the crash using the kbates/ubuntu:py-3.10.6 image https://user-images.githubusercontent.com/6024261/216728206-7f4c1b18-67d4-423b-806c-a12fab0d08cb.mp4

kevin-bates commented 1 year ago

After holding down the Return key to produce 116 prompts, I did receive the asyncio events exception to which you refer and, when this happens, I am unable to break out of the console application and have to terminate my container instance.[*]

Since this looks like the process is entirely in the kernel, I decided to try this with the xeus-python kernel to see if it might behave differently.

$ pip install xeus-python
$ jupyter console --kernel=xpython

With this, the issue reproduced after about 140 prompts, but at least the command prompt was restored after producing this output (although I'm still unable to exit the console from the prompt, so the traditional "exit commands" are not working):

Unhandled exception in event loop:
  File "/usr/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)

Exception cannot enter context: <_contextvars.Context object at 0x7f54241434c0> is already entered
Press ENTER to continue...
In [1]: 

Unhandled exception in event loop:
  File "/usr/lib/python3.10/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "/usr/local/lib/python3.10/dist-packages/prompt_toolkit/application/application.py", line 707, in read_from_input
    self.key_processor.process_keys()
  File "/usr/local/lib/python3.10/dist-packages/prompt_toolkit/key_binding/key_processor.py", line 270, in process_keys
    self._process_coroutine.send(key_press)

Exception generator already executing
Press ENTER to continue...

In [1]: 

In [1]: 

In [1]: 

Just looking at the stack traces, I suspect the underlying issue is related to asyncio and the kernel processes' interaction with it. Based on the point at which this is happening, I don't think jupyter_client/consoleapp.py has much to do with it, but this is definitely not an area with which I'm very familiar, so I will need to defer to someone else at this point.
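
For reference, both messages in the traces above are re-entrancy guards in Python itself, and each can be triggered with the standard library alone. The sketch below is only an illustration of what the messages mean; the helper names are made up for this example, and it does not establish how the re-entry actually happens inside the console.

```python
import contextvars


def reenter_context():
    """Return the message raised when a Context is entered while already active."""
    ctx = contextvars.copy_context()

    def outer():
        # ctx is already entered at this point, so a second run() is rejected.
        return ctx.run(lambda: None)

    try:
        ctx.run(outer)
    except RuntimeError as exc:
        return str(exc)
    return None


def reenter_generator():
    """Return the message raised when a generator is resumed from inside itself."""

    def processor():
        while True:
            key = yield
            if key == "reenter":
                # The generator is still running, so send() on itself fails.
                gen.send("nested")

    gen = processor()
    next(gen)  # advance to the first yield
    try:
        gen.send("reenter")
    except ValueError as exc:
        return str(exc)
    return None


print(reenter_context())
print(reenter_generator())
```

The first message has the form "cannot enter context: <...> is already entered" and the second is "generator already executing", which matches the two exceptions shown above. Both point at some callback being run again while an earlier invocation is still in progress.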


[*] If I exec into the container from another shell, I can issue a process-status:

root@4a49af285a81:/# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 23:38 pts/0    00:00:00 bash
root        41     1  0 23:40 pts/0    00:00:05 /usr/bin/python3 /usr/local/bin/jupyter-console --kernel=xpython
root        45    41  0 23:40 ?        00:00:01 /usr/bin/python3 -m xpython_launcher -f /root/.local/share/jupyter/runtime/kernel-41.json
root        61     0  1 23:53 pts/1    00:00:00 bash
root        69    61  0 23:53 pts/1    00:00:00 ps -ef

so there might be some further troubleshooting you can do with this and/or keep the container running this way by terminating (and restarting) the console app.
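
As a rough sketch of that troubleshooting step, the `ps -ef` output can be parsed to pick out the console process and the kernel it spawned. The function name and the `jupyter-console` search string are assumptions made for this example; the sample text is the listing above.

```python
SAMPLE_PS = """\
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 23:38 pts/0    00:00:00 bash
root        41     1  0 23:40 pts/0    00:00:05 /usr/bin/python3 /usr/local/bin/jupyter-console --kernel=xpython
root        45    41  0 23:40 ?        00:00:01 /usr/bin/python3 -m xpython_launcher -f /root/.local/share/jupyter/runtime/kernel-41.json
root        61     0  1 23:53 pts/1    00:00:00 bash
root        69    61  0 23:53 pts/1    00:00:00 ps -ef
"""


def find_console_processes(ps_output, needle="jupyter-console"):
    """Return (console_pids, kernel_pids) found in `ps -ef` style output."""
    rows = []
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 7)  # the 8th field is the full command line
        if len(fields) < 8:
            continue
        rows.append((int(fields[1]), int(fields[2]), fields[7]))
    consoles = [pid for pid, _, cmd in rows if needle in cmd]
    # Treat any direct child of a console process as its kernel.
    kernels = [pid for pid, ppid, _ in rows if ppid in consoles]
    return consoles, kernels


consoles, kernels = find_console_processes(SAMPLE_PS)
print(consoles, kernels)  # [41] [45]
```

From the second shell, the hung console could then be stopped with something like `os.kill(41, signal.SIGTERM)`, which leaves the container itself running.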

(Currently, after being in the container for a bit, I can honestly say I can't reproduce this - at least after multiple hundreds of prompts (750 and counting) - so this is quickly falling into the highly intermittent category and, again, I believe you might be treading on a python/asyncio anomaly.)

AntoineMorcos commented 1 year ago

@kevin-bates I tried with xeus-python and it does seem less likely to happen, but I was still able to reproduce it quickly multiple times. I noticed the error was more likely to occur if I type the command "jupyter console --kernel=xpython" and then hold Return straightaway. It seems like a race condition at initialization. Attaching a screen recording for reference. Having said that, with the ipython kernel I was still able to reproduce the error frequently when I left the terminal idle for 5-7 seconds before holding Return, so in that case it seems to extend beyond initialization, unless init normally takes multiple seconds.

I'm aware that holding the Return key isn't the usual use of the terminal, but I believe it might help identify the issue since there is no other known way to reliably reproduce it.

https://user-images.githubusercontent.com/6024261/216958215-ee66a035-3b50-462d-af85-77f25b09cde7.mp4

kevin-bates commented 1 year ago

Hi @AntoineMorcos - this is beyond my area of expertise. All jupyter console does is start a kernel (via a KernelManager) and initialize the channels (via a KernelClient instance), then hand things off to the corresponding kernel (which is where the In [1]: REPL prompt is managed). What's strange is that xeus-python is written in C++, so it would be using entirely different underlying libraries - although I would imagine ZMQ would derive from the same code (where the python lib probably uses C++ at some point - but I really have no idea).
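
A minimal sketch of that start-kernel-then-open-channels sequence using the public jupyter_client API is shown below. It assumes jupyter_client and ipykernel are installed and that a kernelspec named python3 exists; the helper name `run_in_kernel` is made up for this example, and this is not the code path jupyter console itself uses.

```python
def run_in_kernel(code, kernel_name="python3", timeout=60):
    """Start a kernel, run `code` once, then shut everything down."""
    # Imported here so the sketch can be loaded without jupyter_client installed.
    from jupyter_client import KernelManager

    km = KernelManager(kernel_name=kernel_name)
    km.start_kernel()          # launch the kernel process
    kc = km.client()           # a blocking KernelClient bound to that kernel
    kc.start_channels()        # open the ZMQ channels (shell, iopub, ...)
    try:
        kc.wait_for_ready(timeout=timeout)
        # Runs the code, echoes its output, and returns the execute_reply message.
        reply = kc.execute_interactive(code, timeout=timeout)
        return reply["content"]["status"]
    finally:
        kc.stop_channels()
        km.shutdown_kernel(now=True)
```

Calling `run_in_kernel("import os; print(os.getcwd())")` exercises the same KernelManager and KernelClient pieces without any terminal UI involved, which may help separate kernel-side problems from prompt-handling problems.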

The other, somewhat non-standard thing jupyter console does is it unconditionally sets up an ssh tunnel - even running locally - and I wonder if that is somehow coming into play.

At any rate, I need to defer to those that know the kernel, KernelClient, and asyncio side of things better than I do and hope they might have some ideas.

cc: @davidbrochart, @blink1073, @JohanMabille

blink1073 commented 1 year ago

I've opened https://github.com/jupyter/jupyter_console/pull/276

kevin-bates commented 1 year ago

Whoa. I thought jupyter console came from jupyter_client via jupyter_client/consoleapp.py! My apologies for potentially steering this exercise off-course!

Thanks @blink1073 for pointing this in the right direction.

blink1073 commented 1 year ago

jupyter console uses that class.

blink1073 commented 1 year ago

I just published https://github.com/jupyter/jupyter_console/releases/tag/v6.5.0 which fixes support for jupyter_client 7 and 8.