dask / dask-labextension

JupyterLab extension for Dask
BSD 3-Clause "New" or "Revised" License
311 stars 62 forks source link

Don't check the root bokeh URL #188

Closed ian-r-rose closed 3 years ago

ian-r-rose commented 3 years ago

Fixes #185. Checking the root of the dask dashboard triggers a lot of bokeh document creation, which leaks memory. As a workaround, only sniff the known JSON endpoint.

This should be fairly safe, but I'd like to try this on a pangeo JupyterHub to ensure it doesn't break things.

Also fixes #180 while I'm at it

jrbourbeau commented 3 years ago

cc'ing @bryevdv for awareness about the bokeh document creation leaking memory

bryevdv commented 3 years ago

@jrbourbeau Bokeh checks carefully that the module that is created for each session/document only has the expected number of referrers whenever a Document is destroyed. But it's definitely possible for user app code to hold references to things inside the module that outlive sessions/documents, e.g. by creating threads in the app code that do not get cleaned up. So I guess I'd suggest starting with looking there. I am not ruling out an issue on the Bokeh side of things, but as always, we need an issue with a minimal reproducer to investigate.

As an aside, I had proposed adding an explicit supported pre-flight check option in https://github.com/bokeh/bokeh/issues/8884 but it was closed for seeming lack of interest. If it seems useful after all, please feel free to comment there.

ian-r-rose commented 3 years ago

Thanks for the references @bryevdv. I was able to reproduce this issue more minimally by polling the dask scheduler dashboard, but not by polling one of the bokeh server examples, so it seems more likely that the proliferation of references is in distributed. I would probably be interested in something like the pre-flight check (or even just HEAD or OPTIONS to see if somebody is listening on the other end), for what it's worth. Or if you have other suggestions for best-practices around talking to a bokeh server I'm happy to hear them.

I'm going to merge and publish this since it can result in a pretty nasty user experience, and suggest we continue the discussion in the linked dask/distributed#4827 (where it will hopefully be less pressing).