okfn-brasil / serenata-toolbox

📦 pip module containing code shared across Serenata de Amor's projects | ** Este repositório não recebe atualizações frequentes **
MIT License
154 stars 69 forks source link

Fixing running serenata-toolbox into jupyter with tornado 5.0 #204

Open thiagoalmeidasa opened 5 years ago

thiagoalmeidasa commented 5 years ago

What is the purpose of this Pull Request? This PR aims to fix running serenata-toolbox into jupyter with tornado 5.0.

What was done to achieve this purpose?

Following the issue thread here, I changed the way to launch a task in the Downloader class.

How to test if it really works?

Try to follow this documentation step using a jupyter notebook with tornado 5.0 and face a RuntimeError: This event loop is already running exception. After that, try to install using my branch and realize that exception has disappeared.

Who can help reviewing it?

Anyone involved in the project.

cuducos commented 5 years ago

This PR aims to fix running serenata-toolbox into jupyter with tornado 5.0.

Sorry @thiagoalmeidasa, I'm afraid I missed the point here — there's too much information packed in a simple sentence.

Would you mind expanding on:

thiagoalmeidasa commented 5 years ago

Thanks for the review @cuducos .

Let me try to explain.

Would you mind expanding on:

  • What is the context and use case of running the toolbox in jupyter with tornado 5?
  • What errors are raised in this scenario?

Some days ago I just tried to run this documentation step using a fresh jupyter notebook installation and get this error: RuntimeError: This event loop is already running.

After that, I started to dig about this problem and found this issue thread, then tried to fix by reading and evaluate the suggestions proposed in the comments.

  • Will create_task work on different setups?

I'm not an expert in python development, but this documentation section indicates that create_task will work fine in different setups.

cuducos commented 5 years ago

My point is: I'm afraid that create_task would not invoke the coroutine in setups in which the loop is not already running (that is to say, outside tornado 5.0). Check this simple example I just coded:

In [1]: import asyncio

In [2]: async def magic():
   ...:     print(42)
   ...:

In [3]: loop = asyncio.get_event_loop()

In [4]: loop.create_task(magic())
Out[4]: <Task pending coro=<magic() running at <ipython-input-2-ca8ce5b12955>:1>>

And compare to this one:

In [1]: import asyncio

In [2]: async def magic():
   ...:     print(42)
   ...:

In [3]: loop = asyncio.get_event_loop()

In [4]: loop.run_until_complete(magic())
42

Maybe we need a way to know if the loop is already running and use create_task or run_until_complete accordingly… what do you think?

thiagoalmeidasa commented 5 years ago

Good point.

I'll try to improve this solution. Maybe @willianpaixao could help me to fix that.

willianpaixao commented 5 years ago

I see a couple of ways to solve this problem:

  1. You add an exception that is thrown in case an event loop already exists. The try/catch block should create a new thread-safe event loop. [1] [2]
  2. Before creating a task, check whether there is an existing event loop with asyncio.get_running_loop() and if so, append your task to it, otherwise, create a new event loop as usual with asyncio.new_event_loop() or asyncio.get_event_loop(). [3]
cuducos commented 5 years ago

As get_running_loop can only be called from a coroutine or a callbackl, I think @willianpaixao's first options will result in a more readable code ; )

thiagoalmeidasa commented 5 years ago

And what about that approach?

import asyncio

async def magic():
    print(42)

loop = asyncio.get_event_loop()

if loop.is_running():
    loop.create_task(magic())
else:
    loop.run_until_complete(magic())

It worked fine when using or not jupyter notebooks.

cuducos commented 5 years ago

It looks great : )

thiagoalmeidasa commented 5 years ago

@cuducos what does this test do?

self.assertTrue(loop.create_task.called)

There is a reason to assure which method the download function is using?

cuducos commented 5 years ago

I think we need two tests.

  1. Mock the loop.is_running() so it returns True and we make two assertions: loop.run_until.complete should not be called and loop.create_task should be called
  2. Mock the loop.is_running() so it returns False and we make two assertions: loop.run_until.complete should be called and loop.create_task should not be called