jupyter / nbconvert

Jupyter Notebook Conversion
https://nbconvert.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Async / asynchronous execution #1092

Open maartenbreddels opened 5 years ago

maartenbreddels commented 5 years ago

In voila, we run into a performance problem because nbconvert blocks during the execution of cells (https://github.com/QuantStack/voila/issues/363). The execution is handled in the tornado handler, causing all other traffic to block.

Are there plans to have nbconvert use the tornado/zmq/ioloop and offer an async way of generating cell results?
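
For illustration (not from the thread): a minimal sketch of the blocking problem described above, using only the standard library's asyncio as a stand-in for tornado's event loop. The names `execute_notebook_blocking`, `heartbeat`, and `blocking_handler` are hypothetical; the sleep stands in for a synchronous nbconvert execution call.

```python
import asyncio
import time


def execute_notebook_blocking(seconds):
    # Hypothetical stand-in for a synchronous notebook execution call.
    time.sleep(seconds)
    return "done"


async def heartbeat(ticks, interval=0.05, count=4):
    # Stand-in for other requests the server should keep serving.
    for _ in range(count):
        await asyncio.sleep(interval)
        ticks.append(time.monotonic())


async def blocking_handler():
    # Calling blocking code directly in a coroutine stalls the whole loop:
    # no other task can run until it returns.
    return execute_notebook_blocking(0.3)


async def demo():
    ticks = []
    hb = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # give the heartbeat a chance to start
    await blocking_handler()
    blocked_until = time.monotonic()
    await hb
    return blocked_until, ticks


if __name__ == "__main__":
    blocked_until, ticks = asyncio.run(demo())
    # Every heartbeat tick lands only after the blocking call has finished.
    print(all(t >= blocked_until for t in ticks))
```

The heartbeat is scheduled to tick every 50 ms, yet none of its ticks can happen while the handler is blocked.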

MSeal commented 5 years ago

Do you mean for the ExecutePreprocessor or any template conversion in general?

Per https://github.com/jupyter/nbconvert/issues/1045, we were planning on pulling the execute preprocessor code out of nbconvert, which would be a good opportunity to make it async aware. For other conversions, async would probably also be possible in nbconvert 6.0, as we'll be jumping to Python 3 only and will be able to make more dramatic changes in that process.

bollwyvl commented 5 years ago

Seems like using tornado's run_on_executor or a pool, as on nbviewer, might be appropriate for the voila case.

Having said that, async nbconvert is a whacking good idea, and just expect everything in the pipeline to potentially be async... while still providing the sync API for a while.
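
For illustration (not from the thread): a rough sketch of the executor idea, assuming the blocking execution can safely run in a worker thread. It uses the standard library's `loop.run_in_executor` rather than tornado's `run_on_executor` decorator so that it runs without tornado installed; `execute_notebook_blocking` and `offloaded_handler` are hypothetical names.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def execute_notebook_blocking(seconds):
    # Hypothetical stand-in for a synchronous notebook execution call.
    time.sleep(seconds)
    return "done"


async def heartbeat(ticks, interval=0.05, count=4):
    # Stand-in for other requests the server should keep serving.
    for _ in range(count):
        await asyncio.sleep(interval)
        ticks.append(time.monotonic())


async def offloaded_handler(executor):
    # The blocking call runs in a worker thread; the event loop stays free.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, execute_notebook_blocking, 0.5)


async def demo_offloaded():
    ticks = []
    with ThreadPoolExecutor(max_workers=1) as executor:
        hb = asyncio.create_task(heartbeat(ticks))
        result = await offloaded_handler(executor)
        finished = time.monotonic()
        await hb
    return result, finished, ticks


if __name__ == "__main__":
    result, finished, ticks = asyncio.run(demo_offloaded())
    print(result, ticks[0] < finished)
```

The handler still gets its result only once the whole call has finished; what changes is that other tasks keep running in the meantime.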

kevin-bates commented 4 years ago

Interesting. FWIW - @takluyver asked me to look into getting ExecutePreprocessor (relative to #809) working in a synchronous manner given that we decided to make the new kernel management package only async (see this comment).

But, given this conversation, would it be more worthwhile to see how this behaves in an async manner anyway (also relative to work done in https://github.com/takluyver/jupyter_kernel_mgmt/pull/23)?

Not sure how this effort, in either case, would be affected by #1045 but I think some good data points can be learned in any event.
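
For illustration (not from the thread): one way a synchronous entry point could sit on top of an async-only component. This is a sketch under stated assumptions; `AsyncExecutor` and `SyncExecutor` are hypothetical classes, not part of nbconvert, jupyter_client, or jupyter_kernel_mgmt.

```python
import asyncio


class AsyncExecutor:
    # Hypothetical async-only component (e.g. built on an async kernel API).
    async def execute_cell(self, source):
        await asyncio.sleep(0)  # placeholder for awaiting a kernel reply
        return f"ran {source!r}"


class SyncExecutor:
    # Hypothetical synchronous facade over the async implementation.
    def __init__(self):
        self._inner = AsyncExecutor()

    def execute_cell(self, source):
        # asyncio.run starts a fresh event loop for this call. It raises
        # RuntimeError when called from inside an already running loop,
        # so this facade is only suitable for plain synchronous callers.
        return asyncio.run(self._inner.execute_cell(source))


if __name__ == "__main__":
    print(SyncExecutor().execute_cell("1 + 1"))
```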

MSeal commented 4 years ago

@mpacer had the most interest in spearheading a separate execute package that nbconvert and papermill both imported. I believe she also wanted to support async and synchronous use-cases in this change.

maartenbreddels commented 4 years ago

Great. I've been doing some experimentation with using asyncio with voila, and it seems to be working well (tornado is much more responsive). It requires changes/features in jupyter_client, and I think we'd want to not develop this inside of nbconvert. (No PRs open yet.)

mpacer commented 4 years ago

I'm definitely interested in this, and appreciate folks bringing up this exciting topic!

I am not sure how many cycles I will have free to spearhead it in a reasonable time-frame.

@maartenbreddels I realize you don't have any PRs in nbconvert or jupyter_client, but is there somewhere where we all could look at the code to more fully grok how you're using asyncio with voila?

mpacer commented 4 years ago

> Seems like using tornado's run_on_executor or a pool, as on nbviewer, might be appropriate for the voila case.

I think that might be a decent stop-gap measure. @bollwyvl, could you link to the code on nbviewer that could give folks a starting place for how to implement this for voila?

maartenbreddels commented 4 years ago

Yes, I plan to open a WIP PR for that on Friday. I have to clean up the code a bit; it's a lot of copy/paste plus adding async and await in many places. But I'd rather have a bad PR open to start experimenting/discussing.

maartenbreddels commented 4 years ago

> Seems like using tornado's run_on_executor or a pool, as on nbviewer, might be appropriate for the voila case.
>
> I think that might be a decent stop-gap measure. @bollwyvl, could you link to the code on nbviewer that could give folks a starting place for how to implement this for voila?

@jtpio had a nice PR on that: https://github.com/QuantStack/voila/pull/364/files

But this solution will not work for us, since we want to execute cell by cell (async) to allow progressive rendering. The approach in that PR returns the result only once the whole notebook has finished executing, not after each cell execution.
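
For illustration (not from the thread or from either PR): a small sketch of the difference between whole-notebook and cell-by-cell execution, using an async generator. `execute_cell` is a hypothetical stand-in for sending one cell to a kernel and awaiting its reply.

```python
import asyncio


async def execute_cell(source):
    # Hypothetical stand-in for running one cell on a kernel.
    await asyncio.sleep(0.01)
    return f"output of {source!r}"


async def execute_all(cells):
    # Whole-notebook style: nothing is available until every cell has run.
    return [await execute_cell(source) for source in cells]


async def execute_progressively(cells):
    # Cell-by-cell style: each result is yielded as soon as it is ready.
    for index, source in enumerate(cells):
        yield index, await execute_cell(source)


async def render(cells):
    rendered = []
    async for index, output in execute_progressively(cells):
        # A server could flush this chunk to the client at this point.
        rendered.append((index, output))
    return rendered


if __name__ == "__main__":
    for index, output in asyncio.run(render(["a = 1", "a + 1"])):
        print(index, output)
```

With the generator form, the consumer decides what to do after each cell, which is what progressive rendering needs; the executor approach only hands back the finished list.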

maartenbreddels commented 4 years ago

A POC for an async jupyter client is at https://github.com/jupyter/jupyter_client/pull/471, and voila is using it here: https://github.com/QuantStack/voila/pull/374