Open gagern opened 1 year ago
Using IPython.core.debugger.set_trace I've traced the synchronous call stack through ipywidgets and comm to ipykernel.iostream.BackgroundSocket.send_multipart and found that the data is still in order at that point. Even monkey-patching the method to pass bytes objects instead of the (potentially mutable) memoryview didn't change things, so it doesn't seem to be due to asynchronous mutation of the memory area, and it does seem to be nested fairly deeply in the communication stack. I don't have a good idea for how to trace the asynchronous communication, though, since it's harder to get a pdb attached to a background thread.
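For anyone who wants to repeat that monkey-patching experiment, here is a minimal sketch of the idea. It is not the exact patch used above: FakeSocket is a hypothetical stand-in so the snippet runs without a kernel, and the wrapper assumes that the first positional argument of the patched method is the list of message frames.

```python
import functools


def force_bytes_frames(send_multipart):
    """Wrap a send_multipart-style method so that mutable frames
    (memoryview, bytearray) are copied into immutable bytes first."""

    @functools.wraps(send_multipart)
    def wrapper(self, msg_parts, *args, **kwargs):
        frozen = [
            bytes(part) if isinstance(part, (memoryview, bytearray)) else part
            for part in msg_parts
        ]
        return send_multipart(self, frozen, *args, **kwargs)

    return wrapper


class FakeSocket:
    """Hypothetical stand-in for the real socket class; it only records frames."""

    def __init__(self):
        self.sent = []

    def send_multipart(self, msg_parts, *args, **kwargs):
        self.sent.append(list(msg_parts))


# Patch at class level, the same way one would patch the real class.
FakeSocket.send_multipart = force_bytes_frames(FakeSocket.send_multipart)
```

In a live notebook the same wrapper could presumably be applied to ipykernel.iostream.BackgroundSocket.send_multipart, provided its call signature matches the assumption stated above. If the corruption persists with immutable frames, later mutation of the buffer is unlikely to be the cause.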
Thanks for opening an issue and for the thorough debugging.
My gut feeling is that tornado has a web socket message size limit that you are reaching, and that tornado may be truncating your message. But I suppose this hypothesis does not match the diff you are seeing.
A plain size limit doesn't match the buffer size being the same on both sides of the channel. It must be at least somewhat more subtle, to allow the length to be passed on independently of the content.
Have you been able to reproduce? I'm having a hard time reproducing just now. I could reproduce with near certainty yesterday. Today I wanted to explore other browsers (the original report was using Chrome) and use Wireshark to observe the data over the wire, and suddenly I can't reproduce with any browser. I will try again, and I'm wondering what other aspects could come into this, including system load, network connectivity, or perhaps actually bad hardware.
I've been doing large-scale drawing commands with several thousands of move-to and line-to commands. I'm regularly observing problems where the notebook UI doesn't reflect these draws, and the developer tools complain about invalid JSON.
Steps to reproduce and debug:
JSON.parse
error.buf.json
above and the intercepted web socket traffic.
Expected behavior:
No errors reported, JSON syntax fine, draw commands get executed.
Actual error: Exception raised in the code below:
https://github.com/martinRenou/ipycanvas/blob/d9ee0a16cf0011978b5a5eedb165dbca564eb8c4/src/widget.ts#L155-L157
On different snippets of code I've seen other exceptions as well, e.g. indicating that the parser expected ] or , instead of a digit.
Further investigation:
The content I constructed from within Python and the content I captured on the web socket both have equal length. But the content doesn't match. At first I had assumed a reordering of content, but on closer inspection it looks like some content gets sent twice, while other content doesn't get sent at all.
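A comparison along these lines can be scripted. The sketch below is illustrative rather than the procedure actually used for this report: it finds the first differing offset, then reports which fixed-size chunks of the original occur less often or more often in the capture. The function names and the chunk size are arbitrary choices, and both buffers are assumed to be available as bytes.

```python
def first_mismatch(original: bytes, captured: bytes) -> int:
    """Return the first offset where the two buffers differ, or -1 if identical."""
    for offset, (a, b) in enumerate(zip(original, captured)):
        if a != b:
            return offset
    if len(original) != len(captured):
        return min(len(original), len(captured))
    return -1


def chunk_report(original: bytes, captured: bytes, chunk_size: int = 64):
    """Classify chunks of the original by how often they occur in the capture.

    Returns (missing, repeated): start offsets in the original of chunks that
    occur less often, respectively more often, in the capture than in the
    original itself. The search is not alignment-sensitive.
    """
    missing, repeated = [], []
    for start in range(0, len(original), chunk_size):
        chunk = original[start:start + chunk_size]
        expected = original.count(chunk)
        seen = captured.count(chunk)
        if seen < expected:
            missing.append(start)
        elif seen > expected:
            repeated.append(start)
    return missing, repeated
```

A pure reordering would leave both lists empty while still producing a mismatch offset, so non-empty lists point towards duplicated and dropped blocks instead.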
For my one test run the first line of difference was this:
You can see how the web socket data may be valid JSON at this point but is already violating the nesting structure of the original content. The grep below picks 3 specific numbers occurring around the point cited above. It shows how the original data contains each number exactly once, but the web socket capture has multiple repeats of these, in different constellations in places that break the structure.
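The same occurrence count can be done from Python instead of grep. This is a minimal sketch; the sample numbers in the usage example are made up for illustration and are not the values from the capture.

```python
import re


def count_tokens(data: str, tokens):
    """Count how often each numeric token occurs as a whole number in data.

    The lookarounds prevent a token from matching inside a longer number,
    e.g. "5" inside "1.5" or "12" inside "312".
    """
    counts = {}
    for token in tokens:
        pattern = r"(?<![\d.])" + re.escape(token) + r"(?![\d.])"
        counts[token] = len(re.findall(pattern, data))
    return counts


# Hypothetical data: the "captured" string repeats two numbers and drops one.
original = "[[1.5,2.25],[3.75,4.0]]"
captured = "[[1.5,2.25],[2.25,1.5]]"
print(count_tokens(original, ["1.5", "2.25", "3.75"]))
print(count_tokens(captured, ["1.5", "2.25", "3.75"]))
```

Numbers that occur exactly once in the original but several times, or not at all, in the capture are consistent with blocks being duplicated and dropped rather than merely reordered.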
My environment:
I'm observing this from a virtualenv with the following packages installed (according to their *.dist-info directories):
anyio-3.6.2 argon2_cffi-21.3.0 argon2_cffi_bindings-21.2.0 arrow-1.2.3 asttokens-2.2.1 attrs-22.2.0 backcall-0.2.0 beautifulsoup4-4.11.2 bleach-6.0.0 cffi-1.15.1 comm-0.1.2 debugpy-1.6.6 decorator-5.1.1 defusedxml-0.7.1 executing-1.2.0 fastjsonschema-2.16.2 fqdn-1.5.1 idna-3.4 ipycanvas-0.13.1 ipykernel-6.21.2 ipython-8.10.0 ipython_genutils-0.2.0 ipywidgets-8.0.4 isoduration-20.11.0 jedi-0.18.2 Jinja2-3.1.2 jsonpointer-2.3 jsonschema-4.17.3 jupyter_client-8.0.2 jupyter_core-5.2.0 jupyter_events-0.6.3 jupyterlab_pygments-0.2.2 jupyterlab_widgets-3.0.5 jupyter_server-2.2.1 jupyter_server_terminals-0.4.4 MarkupSafe-2.1.2 matplotlib_inline-0.1.6 mistune-2.0.5 nbclassic-0.5.1 nbclient-0.7.2 nbconvert-7.2.9 nbformat-5.7.3 nest_asyncio-1.5.6 notebook-6.5.2 notebook_shim-0.2.2 numpy-1.24.2 packaging-23.0 pandocfilters-1.5.0 parso-0.8.3 pexpect-4.8.0 pickleshare-0.7.5 Pillow-9.4.0 pip-23.0 platformdirs-3.0.0 prometheus_client-0.16.0 prompt_toolkit-3.0.36 psutil-5.9.4 ptyprocess-0.7.0 pure_eval-0.2.2 pycparser-2.21 Pygments-2.14.0 pyrsistent-0.19.3 python_dateutil-2.8.2 python_json_logger-2.0.6 PyYAML-6.0 pyzmq-25.0.0 rfc3339_validator-0.1.4 rfc3986_validator-0.1.1 Send2Trash-1.8.0 setuptools-66.1.1 six-1.16.0 sniffio-1.3.0 soupsieve-2.4 stack_data-0.6.2 terminado-0.17.1 tinycss2-1.2.1 tornado-6.2 traitlets-5.9.0 uri_template-1.2.0 wcwidth-0.2.6 webcolors-1.12 webencodings-0.5.1 websocket_client-1.5.1 wheel-0.38.4 widgetsnbextension-4.0.5
I don't have orjson installed, and ipycanvas.utils.ORJSON_AVAILABLE reports False in my notebook. I have highlighted some of the components that sound like they are involved in the machinery for sending this content, but I might be wrong.
I would assume a bug in some underlying communication library, but at the moment I don't have a sufficient understanding of the involved components to judge where that error might lie. So I'm reporting the issue first at the point where I observe it, and hope that someone can help me construct appropriate reproducing examples on lower levels so that the suitable spin-off issues can be created.