Closed jules32 closed 2 years ago
I haven't been able to repro anything terribly unexpected here. I've downloaded these notebooks and included this _quarto.yml file:
project:
  type: website
website:
  navbar:
    title: "Test Notebooks"
    left:
      - text: NB 1
        href: GESDISC_MERRA2_tavg1_2d_flx_Nx__Kerchunk.ipynb
      - text: NB 2
        href: LPDAAC_ECOSTRESS_LSTE__Kerchunk.ipynb
      - text: NB 3
        href: PODAAC_ECCO_SSH__Kerchunk.ipynb
On my laptop it takes about 9 seconds to render the site. Depending on how fast your server machine is, though, this could blow up quite a bit (5x slower?). In terms of preview, assuming the site is fully rendered, it comes up in a second or two and each page takes about 3 seconds to render when clicked on in the browser (of course this could be considerably slower if your server is slower).
So it does seem like on a slower server machine .ipynb files of this size could be slow to preview (but definitely wouldn't be slow to serve to end users). These are ~10 MB notebooks, so this might be close to as good as it gets, but I'll do some more digging to see what's accounting for the time and whether we can do anything better.
Okay, in my benchmarking it only takes ~50ms to read the 10 MB ipynb, so the size itself is not a problem. My guess is that there is something related to regular expressions running over huge chunks of HTML -- hopefully this is something we can find a clean workaround for and make this go a lot faster. More soon.
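For anyone who wants to check the read time on their own machine, here is a minimal Python sketch. It is not how quarto measures anything; the function name time_notebook_read is hypothetical, and it assumes "reading" means opening the file and parsing it as JSON.

```python
import json
import time


def time_notebook_read(path):
    """Read and JSON-parse a notebook, returning (notebook, elapsed milliseconds)."""
    start = time.perf_counter()
    with open(path, "r", encoding="utf-8") as f:
        notebook = json.load(f)  # .ipynb files are JSON documents
    elapsed_ms = (time.perf_counter() - start) * 1000
    return notebook, elapsed_ms
```

Calling time_notebook_read on one of the notebooks above and printing the second return value gives a rough read time to compare against the total render time.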
I have made a couple of changes that will help some aspects of preview performance for large notebooks: https://github.com/quarto-dev/quarto-cli/commit/97f52c79f5a6e6861a4d45f4a1667b3a9a7013a9
The fact that ~10 MB and larger notebooks take a while to render (~3 seconds on my laptop) isn't something I can see easy ways to improve. This is mostly because many passes are made over the content (by pandoc and by quarto) and that volume just ends up taking more time. So rendering and initially previewing these notebooks will be slow.
In quarto preview we attempt to avoid re-rendering when we can. This was formerly done by checking the content hash of the input and output but I noticed that for large notebooks just building the hash could take 1.5 seconds! For notebooks we now use file modification times instead. Very slightly less robust but much faster.
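To illustrate the trade-off between the two checks, here is a minimal Python sketch. It is not quarto's actual implementation: the function names are hypothetical, SHA-256 is an assumed hash algorithm, and comparing the input's modification time against the output's is an assumed way of applying the mtime check.

```python
import hashlib
import os


def content_hash(path):
    """Hash the whole file; the cost grows with file size."""
    digest = hashlib.sha256()  # assumed algorithm, for illustration only
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large notebooks are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_render_by_mtime(input_path, output_path):
    """Return True when the output is missing or older than the input."""
    if not os.path.exists(output_path):
        return True
    # Only file metadata is read here, so the cost does not depend on size.
    return os.path.getmtime(input_path) > os.path.getmtime(output_path)
```

The mtime check can be fooled (for example, by a file that is touched without its content changing), which is the "slightly less robust" part, but it never reads the notebook's bytes.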
We also track which notebooks we've already rendered -- we weren't however tracking notebooks rendered on startup (resulting in the potential for two renders, one during the initial pass and one when serving the preview). We now track the initial render.
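As a rough Python sketch of the tracking idea (again, not quarto's code: RenderTracker and its methods are hypothetical names), the point is that the startup render and the on-demand render record into the same place, so a page rendered during startup is not rendered again when it is first served.

```python
import os


class RenderTracker:
    """Remember the input modification time seen at each render (hypothetical)."""

    def __init__(self):
        self._rendered = {}  # input path -> mtime at the time of the last render

    def mark_rendered(self, path):
        """Record a render, whether it happened at startup or while serving."""
        self._rendered[path] = os.path.getmtime(path)

    def needs_render(self, path):
        """True if the file was never rendered or has changed since the last render."""
        return self._rendered.get(path) != os.path.getmtime(path)
```

In this sketch the initial pass calls mark_rendered for every notebook it renders, and the preview server calls needs_render before rendering a requested page.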
Net of this is that you will always have to render each .ipynb at least one time before seeing a preview of it, and these renders can in fact be quite slow depending on the size of the ipynb and the speed of the machine. However, once rendered (and assuming the underlying ipynb doesn't change) the speed of preview should be nearly instant. For example, if you do this to start the preview:
quarto preview --render all
Then once the web browser opens up you'll get extremely fast renders of each page (this definitely wasn't the case prior to the changes I made).
Just to clarify, I'm not suggesting that you preview with --render all (that was just for illustration). You can continue to preview exactly as you do now and .ipynb files should get rendered exactly once for each time they are changed.
Hi @jjallaire this is super helpful, thank you for exploring this and the improvements and workflow. We will try it out in JupyterHub and share any updates :)
Following up to say this is much faster for us now, thank you so much! Closing this issue :)
Hi!
We've been having issues with .ipynb files that have Holoviews/Bokeh - these often take a long time to quarto preview in our JupyterHub, and also with our GitHub Action. We've encountered this before and likely chatted about it but wanted to revisit. Here are a few example notebooks where we are seeing this (included in _quarto.yml#L52-L58):
quarto preview was taking so long
A few more details about it in this earthdata-cloud-cookbook issue, where @betolink says:
Thanks in advance for any help!