wooey / Wooey

A Django app that creates automatic web UIs for Python scripts.
http://wooey.readthedocs.org
BSD 3-Clause "New" or "Revised" License

Wooey and while True #198

Open toert opened 6 years ago

toert commented 6 years ago

If I start a script that contains "while True" and later try to stop it via the Stop button, it does not stop and keeps running indefinitely. My current workaround is to kill all the Celery worker processes. Do you have any ideas how I can stop such never-ending scripts via the web interface rather than the CLI?

Chris7 commented 6 years ago

Does your script have a blind try/except clause? If so, it may be swallowing the exception celery sends to stop a task.
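
To illustrate the point above, here is a minimal sketch of how a blind except can swallow a stop request. It assumes the stop arrives as a BaseException subclass such as SystemExit; the thread does not confirm how Celery actually delivers it, and all function names are hypothetical.

```python
def request_stop():
    """Stand-in for a stop request (assumption: it surfaces as SystemExit)."""
    raise SystemExit("stop requested")


def run_with_blind_except(step):
    """A bare `except:` catches BaseException, so the stop request is lost."""
    try:
        step()
    except:  # noqa: E722 - deliberately blind, to show the problem
        return "swallowed"
    return "finished"


def run_with_narrow_except(step):
    """`except Exception:` lets SystemExit and KeyboardInterrupt propagate."""
    try:
        step()
    except Exception:
        return "handled"
    return "finished"
```

With these definitions, run_with_blind_except(request_stop) returns normally, while run_with_narrow_except(request_stop) lets SystemExit escape to the caller.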

toert commented 6 years ago

It actually doesn't have any blind try/except clauses

Chris7 commented 6 years ago

Can you provide the script? I just tested and celery successfully terminated:

[2018-01-01 23:24:57,945: INFO/MainProcess] Terminating 90cbb3e6-7026-45c9-b800-8e2dad93f2f1 (Signals.SIGKILL)
[2018-01-01 23:24:57,976: ERROR/MainProcess] Task wooey.tasks.submit_script[90cbb3e6-7026-45c9-b800-8e2dad93f2f1] raised unexpected: Terminated(9,)
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/billiard/pool.py", line 1678, in _set_terminated
    raise Terminated(-(signum or 0))
billiard.exceptions.Terminated: 9

This is the script I tested with:

import argparse
import sys

parser = argparse.ArgumentParser(description="run forever!")

def main():
    while True:
        import time
        time.sleep(5)

if __name__ == "__main__":
    parser.parse_args()
    sys.exit(main())

toert commented 6 years ago

import sys
import time
import argparse
import logging

import requests  # pip install requests

logging.basicConfig(level=logging.INFO)

parser = argparse.ArgumentParser()
parser.add_argument('--m', type=int)

args = parser.parse_args()

def main():
    print('Hello')
    logging.info('start')

    while True:
        print('Hello True')
        logging.info('i am still running')
        requests.get('http://127.0.0.1:8000')
        time.sleep(10)

if __name__ == "__main__":
    sys.exit(main())

And it sends requests even after clicking Stop button.

How I run celery:

#!/bin/sh
cd /opt
source venv/bin/activate
cd DO_wooey
python manage.py celery worker -c 5 --beat -l info

Chris7 commented 6 years ago

Sorry it's taken me a bit to get to this. I just tried your script and the stop button worked.

celery_1  | [2018-01-22 15:29:53,680: INFO/MainProcess] Received task: wooey.tasks.submit_script[df427903-e036-44b9-84bd-192c4a670ed0]
celery_1  | [2018-01-22 15:30:07,053: INFO/MainProcess] Terminating df427903-e036-44b9-84bd-192c4a670ed0 (Signals.SIGKILL)
celery_1  | [2018-01-22 15:30:07,086: ERROR/MainProcess] Task wooey.tasks.submit_script[df427903-e036-44b9-84bd-192c4a670ed0] raised unexpected: Terminated(9,)
celery_1  | Traceback (most recent call last):
celery_1  |   File "/usr/local/lib/python3.6/site-packages/billiard/pool.py", line 1678, in _set_terminated
celery_1  |     raise Terminated(-(signum or 0))
celery_1  | billiard.exceptions.Terminated: 9

I would look at your installed dependencies.

Chris7 commented 6 years ago

What version of python are you using and can you provide the output of pip freeze?

toert commented 6 years ago

Dependencies https://github.com/toert/currencies_bot/blob/master/requirements.txt

Chris7 commented 6 years ago

I think I know the reason -- the default broker is SQL, which is useful for development/testing. However, the broadcast/control commands are not supported by this broker. When you run python manage.py celery inspect active, do you receive "Error: Broadcast not supported by SQL broker transport"?

To fix this, you need to define a "real" broker like rabbit in your user_settings (BROKER_URL).
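
As a rough illustration, the setting might look like the following in user_settings. The URL is a placeholder that assumes a local RabbitMQ with its default guest account; substitute your own host and credentials.

```python
# user_settings.py (hypothetical excerpt)
# Assumed placeholder: RabbitMQ on localhost with the default guest account.
BROKER_URL = "amqp://guest:guest@localhost:5672//"
```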

toert commented 6 years ago

No, I use a real broker:

(venv) toerting@ubuntu-1gb-lon1-01:/opt/DO_wooey$ python manage.py celery inspect active
-> celery@ubuntu-1gb-lon1-01: OK

Chris7 commented 6 years ago

Ok, to debug this I'll need a step by step to reproduce on my end from a clean setup.

toert commented 6 years ago

I inspected Wooey's source code and found that job processes are killed with SIGKILL (signal 9), which a process cannot block or catch. Also, after sending SIGKILL, scripts stop immediately and then rerun. The cause is the messages queued in RabbitMQ: Wooey's stop job button does nothing with queued messages. However, if a process stops on its own (an exception, exit code 0, etc.) and the job status becomes 'Completed', the message disappears from the RabbitMQ queue. Setting the status to 'Completed' via the Django admin doesn't delete the message. Is there any way to purge the queue after clicking the stop job button?

Chris7 commented 6 years ago

This is tricky. The stop behavior in celery is this:

When a worker receives a revoke request it will skip executing the task, but it won’t terminate an already executing task unless the terminate option is set. If terminate is set the worker child process processing the task will be terminated. The default signal sent is TERM, but you can specify this using the signal argument. Signal can be the uppercase name of any signal defined in the signal module in the Python Standard Library.
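
The quoted behavior maps onto Celery's app.control.revoke call, which accepts terminate and signal arguments. Below is a minimal sketch, not Wooey's actual implementation; stop_task and its force flag are hypothetical names, and app is assumed to be an already configured Celery application.

```python
def stop_task(app, task_id, force=False):
    """Revoke a task and terminate the worker child that is running it.

    `app` is assumed to be a configured Celery application. With
    force=False the default TERM signal is used; force=True escalates
    to SIGKILL, which the process cannot catch or ignore.
    """
    sig = "SIGKILL" if force else "SIGTERM"
    # terminate=True is what makes the worker act on a running task;
    # without it, revoke only prevents tasks that have not started yet.
    app.control.revoke(task_id, terminate=True, signal=sig)
    return sig
```

Note that revoke relies on the broadcast/control channel, so it needs a broker that supports it.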

There seems to be no good way to stop a stuck process that doesn't either nuke the entire worker or risk not actually working (a process can ignore SIGHUPs). I think a better solution would be to have a STOPPING state after a SIGHUP is sent, and then if a task is in STOPPING, have the Stop button change to Kill which will terminate the task/process.
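
The two-stage idea (ask politely, then kill) can be sketched with the standard library alone. This is only an illustration on a plain child process, not on a Celery task, and it uses SIGTERM as the polite signal rather than the SIGHUP mentioned above; stop_then_kill and grace_seconds are hypothetical names.

```python
import subprocess


def stop_then_kill(proc, grace_seconds=5.0):
    """Ask a child process to stop, then force-kill it if it does not exit.

    Returns the child's return code (on POSIX, the negative signal number
    when the child was ended by a signal).
    """
    proc.terminate()  # SIGTERM on POSIX: can be caught or ignored
    try:
        # This waiting period plays the role of the proposed STOPPING state.
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX: cannot be caught or ignored
        return proc.wait()
```

A child that ignores SIGTERM survives the first stage and is only ended by the second, which is the case the Kill button would cover.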

Also after sending SIGKILL scripts stop immediately and then rerun.

I think the reason it is rerunning is because you have ACKS_LATE set to True. This means that a task is only taken off the queue after it is successful. One option is to disable ACKS_LATE and use the rerun command instead to selectively requeue work.

Is there any way to purge the queue after clicking the stop job button?

You can purge messages through celery (look at celery purge) or through RabbitMQ's management page.
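
For purging from code instead of by hand, Celery also exposes app.control.purge(). The sketch below is a hedged illustration; purge_waiting_messages is a hypothetical helper and app is assumed to be a configured Celery application. Be aware that purge discards every waiting message in the app's task queues, not only the one belonging to the stopped job.

```python
def purge_waiting_messages(app):
    """Discard all messages still waiting in the app's task queues.

    `app` is assumed to be a configured Celery application. Returns the
    number of messages discarded, as reported by Celery.
    """
    # Tasks that a worker has already reserved or started are not affected.
    return app.control.purge()
```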

Chris7 commented 6 years ago

@toert I take it this means you are able to stop scripts now?

toert commented 6 years ago

Actually not. Take a look at https://github.com/toert/DO_wooey . As you can see, I haven't defined ACKS_LATE, and its default value is False. celery purge looks good, but I don't want to purge the queue manually every time 😀

Chris7 commented 6 years ago

What OS are you using? I set up a Wooey server using that repository on Python 3.6.5 and halting scripts worked as expected.

Also, you might want to upgrade the version of Wooey you are using to at least the latest in 0.9.x (if not 0.10.x, though 0.10.x has a few changes wrt celery that will require updating some of your settings)