sagemathinc / cocalc

CoCalc: Collaborative Calculation in the Cloud
https://CoCalc.com

jupyter kernel start on a compute server -- sometimes it hangs on starting #7529

Closed williamstein closed 3 months ago

williamstein commented 5 months ago

I just saw the following on a compute server, and it's obviously a very serious bug, so it will get fixed soon:

  1. Created a compute server with the Tensorflow image
  2. Immediately set a brand new Jupyter notebook to run on that compute server and selected a Python kernel.
  3. When I tried to run `import tensorflow as tf`, the startup just sat there with nothing running. Restarting the kernel didn't help.

WORKAROUND: I set the compute server for the notebook back to "Shared Resources" for a few seconds, then back to the compute server. Then everything immediately worked properly.
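
The workaround above amounts to "notice the hang, reset, try again". As an illustration only, here is a minimal Python sketch of that pattern. It is not CoCalc's code: `start` and `reset` are hypothetical callables standing in for "start the kernel" and "switch the notebook to Shared Resources and back", and the timeout values are arbitrary assumptions.

```python
import threading


def call_with_timeout(fn, timeout_s):
    """Run fn() in a daemon thread.

    Returns (True, result) if fn finished in time, (False, None) if it is
    still running after timeout_s seconds. Exceptions from fn are re-raised.
    """
    box = {}

    def worker():
        try:
            box["result"] = fn()
        except BaseException as exc:  # hand the error back to the caller
            box["error"] = exc

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)
    if thread.is_alive():
        # The call is treated as hung; the daemon thread is simply abandoned.
        return False, None
    if "error" in box:
        raise box["error"]
    return True, box.get("result")


def start_with_reset(start, reset, timeout_s=30.0, attempts=2):
    """Try start(); if it hangs, call reset() and try again.

    `start` and `reset` are hypothetical placeholders, not real CoCalc APIs.
    Raises TimeoutError if every attempt hangs.
    """
    for attempt in range(attempts):
        finished, result = call_with_timeout(start, timeout_s)
        if finished:
            return result
        if attempt < attempts - 1:
            reset()  # e.g. the manual toggle described in the workaround
    raise TimeoutError(f"start() hung on all {attempts} attempts")
```

A thread cannot be killed from outside in Python, so a hung `start` call is abandoned rather than stopped; a real implementation would more likely terminate the stuck kernel process.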

williamstein commented 4 months ago

For what it is worth, yesterday I made a bunch of videos involving compute servers and Jupyter and didn't hit this issue once. So it's a bit hard to reproduce.

williamstein commented 3 months ago

I disabled the kernel pool, and in my testing I haven't seen this since. The pool is much less necessary for compute servers. I'll re-open this if the problem appears again.