jupyter / nbclient

A client library for executing notebooks. Formerly nbconvert's ExecutePreprocessor
https://nbclient.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Errors after upgrading to 0.3.0 #58

Open palewire opened 4 years ago

palewire commented 4 years ago

Some users following a pattern similar to the one described in #48 are getting new errors after upgrading to 0.3. When they downgrade to 0.2, the errors are gone. Here's a sample traceback.

Traceback (most recent call last):
  File "update.py", line 154, in <module>
    cli()
  File "/Users/apesce/.local/share/virtualenvs/coronavirus-tracker-dljAgG8F/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/Users/apesce/.local/share/virtualenvs/coronavirus-tracker-dljAgG8F/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/Users/apesce/.local/share/virtualenvs/coronavirus-tracker-dljAgG8F/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/apesce/.local/share/virtualenvs/coronavirus-tracker-dljAgG8F/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/apesce/.local/share/virtualenvs/coronavirus-tracker-dljAgG8F/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "update.py", line 85, in process
    _execute_notebook(
  File "update.py", line 29, in _execute_notebook
    client.execute()
  File "/Users/apesce/.local/share/virtualenvs/coronavirus-tracker-dljAgG8F/lib/python3.8/site-packages/nbclient/util.py", line 37, in wrapped
    result = loop.run_until_complete(coro(self, *args, **kwargs))
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/base_events.py", line 612, in run_until_complete
    return future.result()
  File "/Users/apesce/.local/share/virtualenvs/coronavirus-tracker-dljAgG8F/lib/python3.8/site-packages/nbclient/client.py", line 471, in async_execute
    async with self.async_setup_kernel(**kwargs):
  File "/Users/apesce/.local/share/virtualenvs/coronavirus-tracker-dljAgG8F/lib/python3.8/site-packages/async_generator/_util.py", line 34, in __aenter__
    return await self._agen.asend(None)
  File "/Users/apesce/.local/share/virtualenvs/coronavirus-tracker-dljAgG8F/lib/python3.8/site-packages/nbclient/client.py", line 440, in async_setup_kernel
    await self.async_start_new_kernel_client(**kwargs)
  File "/Users/apesce/.local/share/virtualenvs/coronavirus-tracker-dljAgG8F/lib/python3.8/site-packages/nbclient/client.py", line 393, in async_start_new_kernel_client
    await ensure_async(self.kc.start_channels())
  File "/Users/apesce/.local/share/virtualenvs/coronavirus-tracker-dljAgG8F/lib/python3.8/site-packages/jupyter_client/client.py", line 116, in start_channels
    self.hb_channel.start()
  File "/Users/apesce/.local/share/virtualenvs/coronavirus-tracker-dljAgG8F/lib/python3.8/site-packages/jupyter_client/asynchronous/client.py", line 97, in hb_channel
    loop = asyncio.new_event_loop()
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/events.py", line 758, in new_event_loop
    return get_event_loop_policy().new_event_loop()
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/events.py", line 656, in new_event_loop
    return self._loop_factory()
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/unix_events.py", line 54, in __init__
    super().__init__(selector)
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/selector_events.py", line 61, in __init__
    self._make_self_pipe()
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/selector_events.py", line 108, in _make_self_pipe
    self._ssock, self._csock = socket.socketpair()
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/socket.py", line 571, in socketpair
    a, b = _socket.socketpair(family, type, proto)
OSError: [Errno 24] Too many open files
Exception ignored in: <function BaseEventLoop.__del__ at 0x1065fec10>
Traceback (most recent call last):
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/base_events.py", line 652, in __del__
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/unix_events.py", line 58, in close
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/selector_events.py", line 92, in close
  File "/usr/local/Cellar/python@3.8/3.8.1/Frameworks/Python.framework/Versions/3.8/lib/python3.8/asyncio/selector_events.py", line 99, in _close_self_pipe
AttributeError: '_UnixSelectorEventLoop' object has no attribute '_ssock'
make[1]: *** [process] Error 1
make: *** [update] Error 2
(coronavirus-tracker) apesce@anthonys-mbp coronavirus-tracker % [IPKernelApp] WARNING | Parent appears to have exited, shutting down.
[IPKernelApp] WARNING | Parent appears to have exited, shutting down.
[IPKernelApp] WARNING | Parent appears to have exited, shutting down.
[IPKernelApp] WARNING | Parent appears to have exited, shutting down.
[IPKernelApp] WARNING | Parent appears to have exited, shutting down.
[IPKernelApp] WARNING | Parent appears to have exited, shutting down.
[IPKernelApp] WARNING | Parent appears to have exited, shutting down.

choldgraf commented 4 years ago

Hmm, I wonder if this is related to some of the async work that @davidbrochart did in the last few PRs?

choldgraf commented 4 years ago

@palewire could you provide an example of a notebook or code snippet that reliably creates the problem?

palewire commented 4 years ago

I am unable to replicate the bug, which we've only seen so far on Macbooks running Python 3.8 installed via Homebrew. I'm a neckbeard who uses Ubuntu 16.04, and I'm not seeing any bugs.

Here's basically what's happening in the code. We have this function to run notebooks via nbclient:

import nbformat
from nbclient import NotebookClient
from nbclient.exceptions import CellExecutionError


def _execute_notebook(name, path):
    """
    Private method to execute the provided notebook and handle errors.
    """
    input_path = f"{name}.ipynb"
    output_path = f"{name}-output.ipynb"
    with open(input_path) as f:
        nb = nbformat.read(f, as_version=4)
    client = NotebookClient(
        nb,
        timeout=600,
        kernel_name='python3',
        allow_errors=False,
        force_raise_errors=True,
        resources={'metadata': {'path': path}}
    )
    try:
        client.execute()
    except CellExecutionError:
        msg = f'Error executing the notebook "{input_path}".\n\n'
        # The traceback is saved in the output notebook written below.
        msg += f'See notebook "{output_path}" for the traceback.'
        print(msg)
        raise
    finally:
        # Always save the notebook, so a failed run keeps its traceback.
        with open(output_path, mode='w', encoding='utf-8') as f:
            nbformat.write(nb, f)

Then we feed notebook paths into the function from a list. It's something like this:

def run():
    print("Running notebooks")
    notebook_list = [
        "notebook-1",
        "notebook-2",
        "notebook-3",
        "notebook-4",
    ]
    for notebook in notebook_list:
        notebook_filename = f'./_notebooks/{notebook}'
        print(f"- {notebook_filename}.ipynb")
        _execute_notebook(
            notebook_filename,
            path='_notebooks/'
        )

MSeal commented 4 years ago

Hmm, I'm unlikely to be able to help, because I only have Ubuntu machines as well: an older Ubuntu 18 and a newer Ubuntu 20. Though setting the ulimit lower before running might reproduce it, since macOS has a much lower default limit. I can give it a try later in the week if no one else has narrowed it down by then. Is the error consistent or sporadic? Does it appear after a few notebooks have run, or on the first one? It might be that we're not cleaning up file handles somewhere in the dependency chain.
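
One way to check the file-handle question is to count the process's open file descriptors before and after each notebook run. This is a minimal sketch, assuming a Unix system; count_open_fds is a hypothetical helper, not part of nbclient or jupyter_client, and the 4096 scan cap is an arbitrary choice.

```python
import os
import resource


def count_open_fds(scan_cap=4096):
    """Count file descriptors currently open in this process (Unix only)."""
    soft_limit, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Cap the scan so an "unlimited" soft limit does not mean a huge loop.
    if soft_limit == resource.RLIM_INFINITY:
        upper = scan_cap
    else:
        upper = min(soft_limit, scan_cap)
    open_fds = 0
    for fd in range(upper):
        try:
            os.fstat(fd)  # raises OSError (EBADF) if fd is not open
        except OSError:
            continue
        open_fds += 1
    return open_fds
```

Calling count_open_fds() before and after each call in the notebook loop, and printing the difference, would show whether the count keeps growing from one notebook to the next. A steadily growing count would point to a leak rather than a limit that is simply too low.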

davidbrochart commented 4 years ago

As @MSeal suggested, I could reproduce the bug on Ubuntu 20.04 with Python 3.8 by setting ulimit -n 16, although it crashed during shell channel creation (in ZMQ) instead of heartbeat channel creation (in asyncio.new_event_loop). Both crashes are caused by creating a new socket, so increasing the ulimit might solve the issue, provided that we are not leaking sockets.
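
The same failure mode can be triggered without a notebook by lowering the limit from inside Python. The sketch below is an illustration only, assuming a Unix system; sockets_until_emfile is a hypothetical name, and it exercises plain socket.socketpair() calls (the call at the bottom of the traceback) rather than the nbclient code path.

```python
import errno
import resource
import socket


def sockets_until_emfile(soft_limit=16):
    """Lower the fd limit, then open socket pairs until the OS refuses."""
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Equivalent in spirit to running `ulimit -n 16` before the process.
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard))
    pairs = []
    caught = None
    try:
        while True:
            pairs.append(socket.socketpair())
    except OSError as exc:
        caught = exc.errno
    finally:
        # Release the sockets and restore the original soft limit.
        for a, b in pairs:
            a.close()
            b.close()
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))
    return len(pairs), caught


if __name__ == "__main__":
    opened, err = sockets_until_emfile(16)
    print(f"opened {opened} socket pairs, errno={errno.errorcode.get(err)}")
```

With a limit of 16, only a handful of pairs can be opened before errno 24 (EMFILE, "Too many open files") is raised, which matches the OSError in the original report.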

choldgraf commented 4 years ago

One thing we could do is start running tests on a macOS (or even Windows) build on GitHub Actions, at least once we've got the bug reproducible in test form.

davidbrochart commented 4 years ago

Good point, sounds like a good idea in any case.

palewire commented 4 years ago

This error was encountered by three different users on three different Macbooks, all running the same code.

MSeal commented 4 years ago

I am guessing a ulimit of 256, 512, or 1024 is more accurate for a given Mac. From what I've read, the default and max ulimits on Mac have shifted around in this range depending on the OS version and the available RAM.
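
To see what a given machine actually uses, the limits can be read (and the soft limit raised) from Python. This is a rough sketch, assuming a Unix system; raise_nofile_limit is a hypothetical helper and the 4096 target is an arbitrary value. It only works around a low limit and would not fix a leak.

```python
import resource


def raise_nofile_limit(target=4096):
    """Raise the soft open-files limit toward `target`; return the new soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # The soft limit can never exceed the hard limit.
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft != resource.RLIM_INFINITY and soft < target:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
        except (ValueError, OSError):
            pass  # the OS refused; keep the current limit
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    print("limits before:", resource.getrlimit(resource.RLIMIT_NOFILE))
    print("soft limit after:", raise_nofile_limit())
```

Running this at the top of the script that loops over notebooks would print the limits in effect on the affected Macs, which could confirm or rule out the 256 to 1024 guess.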

SylvainCorlay commented 4 years ago

This is a recurring issue, e.g. with multiple kernels in JLab.

MSeal commented 4 years ago

Likely related to the comments near the end of this thread: https://github.com/jupyter/jupyter_client/pull/548

chrisjsewell commented 4 years ago

Yep, got another occurrence in executablebooks/jupyter-book#867