Closed seanturner026 closed 2 months ago
Using jupyterlab 4.2.4 does not really improve performance
Launching a second notebook server to a Node with this AMI will load the extension instantly (e.g. the first notebook server will take 50 seconds, the second will take .5 seconds)
My guess is that this has very little to do with the dask-labextension project, but cc'ing @jacobtomlinson who can maybe say so with more authority
I'd be curious to know what happens if you stop the container and start it again. Is this some kind of first-run penalty for extensions, or does it happen every load?
I'd also recommend that you head over to the Jupyter forum and ask there, as they will be able to give far better guidance than we can. If they suggest it might be extension related and there's something we can do to help, then feel free to report back here and we can see what can be done.
> I'd be curious to know what happens if you stop the container and start it again. Is this some kind of first-run penalty for extensions, or does it happen every load?
Big-time first-run penalty. Launching a second notebook server on the same Node will load the extension in .2 seconds or something. I have also noticed that the Node takes twice as long to come online as the regular Karpenter AMIs (the ones that weren't built by us). That is likely related to the performance issues we're seeing.
Appreciate the feedback, and I have actually opened up a thread on the jupyter forum already. Will go ahead and close this as this is likely a symptom of the underlying issue rather than the problem itself (but feel free to respond if anything comes to mind :) ).
Describe the issue:
I'm trying to optimize JupyterHub launch speeds by ensuring that some form of our large Data Science image is always available on Nodes, new or old.
I'm using AWS EC2 Image Builder to produce an AMI that has our large Data Science image baked in. This is done using containerd (ctr) to pull the image to the k8s.io namespace that EKS uses. The Image Builder pipeline looks like this:
This AMI is then launched by Karpenter, which always deploys the newest version of the AMI whenever the Cluster needs to scale.
While this reduces the time needed to pull the image (takes 300ms to 20 seconds depending on code changes), it takes almost a minute to load extensions:
Compare this to pods on our non-custom AMI, which load the extensions in 2 seconds.
Is it the dask extension specifically?
Is there a way to lazy load the dask extension?
Is there some weirdness due to ctr? Are perhaps the layers not warmed up even though they’re present on the Node?
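One rough way to probe the "layers not warmed up" question is to time two consecutive reads of the same directory tree on a freshly launched Node. The sketch below is only a diagnostic idea, not something taken from the pipeline above: the function names are hypothetical, and which directory to point it at (for example the labextensions directory inside the container) is an assumption. If the first pass is far slower than the second, that would hint at a first-read penalty in the underlying storage rather than in the extension itself.

```python
import os
import time


def time_tree_read(root, chunk_size=1 << 20):
    """Read every regular file under root once; return (seconds, bytes_read)."""
    total = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    while True:
                        chunk = fh.read(chunk_size)
                        if not chunk:
                            break
                        total += len(chunk)
            except OSError:
                # Unreadable file: ignore it for timing purposes.
                continue
    return time.perf_counter() - start, total


def compare_cold_warm(root):
    """Hypothetical helper: read the tree twice and report both timings."""
    first_s, n_bytes = time_tree_read(root)   # may hit cold storage
    second_s, _ = time_tree_read(root)        # likely served from page cache
    return {"bytes": n_bytes, "first_pass_s": first_s, "second_pass_s": second_s}
```

Calling `compare_cold_warm("/path/to/dir")` once on a brand-new Node and once on a Node that has already served a notebook would give two data points to compare. The second pass is expected to be faster in any case because of the OS page cache, so the size of the gap on a fresh Node is the interesting part.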
Minimal Complete Verifiable Example:
Anything else we need to know?:
Environment:
dask-labextension==6.1.0 jupyterhub=4.0.0 jupyterlab==3.6.3