BharatSahAIyak / autotune

A comprehensive toolkit for seamless data generation and fine-tuning of NLP models, all conveniently packed into a single block.
MIT License

Being able to stop a training/workflow on Autotrain #131

Open Gautam-Rajeev opened 4 months ago

sooraj1002 commented 3 months ago

Celery's gevent pool does not support killing tasks. Revoking a running task with terminate raises the following error on the worker:

[2024-07-31 03:41:44,558: ERROR/MainProcess] pidbox command error: NotImplementedError("<class 'celery.concurrency.gevent.TaskPool'> does not implement kill_job")
Traceback (most recent call last):
  File "/home/shady/Desktop/Work/autotune/venv/lib64/python3.10/site-packages/kombu/pidbox.py", line 102, in dispatch
    reply = handle(method, arguments)
  File "/home/shady/Desktop/Work/autotune/venv/lib64/python3.10/site-packages/kombu/pidbox.py", line 124, in handle_cast
    return self.handle(method, arguments)
  File "/home/shady/Desktop/Work/autotune/venv/lib64/python3.10/site-packages/kombu/pidbox.py", line 118, in handle
    return self.handlers[method](self.state, **arguments)
  File "/home/shady/Desktop/Work/autotune/venv/lib64/python3.10/site-packages/celery/worker/control.py", line 149, in revoke
    task_ids = _revoke(state, task_ids, terminate, signal, **kwargs)
  File "/home/shady/Desktop/Work/autotune/venv/lib64/python3.10/site-packages/celery/worker/control.py", line 224, in _revoke
    request.terminate(state.consumer.pool, signal=signum)
  File "/home/shady/Desktop/Work/autotune/venv/lib64/python3.10/site-packages/celery/worker/request.py", line 416, in terminate
    pool.terminate_job(self.worker_pid, signal)
  File "/home/shady/Desktop/Work/autotune/venv/lib64/python3.10/site-packages/celery/concurrency/base.py", line 113, in terminate_job
    raise NotImplementedError(
NotImplementedError: <class 'celery.concurrency.gevent.TaskPool'> does not implement kill_job

sooraj1002 commented 3 months ago

https://github.com/celery/celery/issues/8687
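
Since the gevent pool cannot kill a job, one possible workaround is cooperative cancellation: the task checks a stop flag between training steps and exits on its own. The sketch below is only an illustration of that idea, not the project's implementation. The names `request_stop`, `should_stop`, `train`, and `TrainingCancelled` are hypothetical, and the in-memory dict stands in for a shared store (for example Redis or the database) that both the API process and the worker could read.

```python
import threading

# Hypothetical in-memory stop-flag store. In a real deployment this would
# need to live in a backend shared by the API process and the worker.
_stop_flags = {}
_lock = threading.Lock()


def request_stop(task_id):
    """Mark a task as 'should stop'. Meant to be called from the API side."""
    with _lock:
        _stop_flags[task_id] = True


def should_stop(task_id):
    """Return True if a stop was requested for this task."""
    with _lock:
        return _stop_flags.get(task_id, False)


class TrainingCancelled(Exception):
    """Raised inside the task when a stop was requested."""


def train(task_id, num_steps, on_step=None):
    """Stand-in for a training loop; returns the number of completed steps.

    The stop flag is checked before every step, so the task can only be
    interrupted at step boundaries, not in the middle of a step.
    """
    completed = 0
    for step in range(num_steps):
        if should_stop(task_id):
            raise TrainingCancelled(f"task {task_id} stopped at step {step}")
        # ... one real training step would run here (assumption) ...
        completed += 1
        if on_step is not None:
            on_step(step)
    return completed
```

Unlike a kill signal, this approach lets the task clean up and record a "cancelled" status before exiting, at the cost of only reacting at step boundaries. If the training loop is Hugging Face's `Trainer` (an assumption about this project), the same check could probably sit in a `TrainerCallback` that sets `control.should_training_stop = True`.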

KDwevedi commented 3 months ago

Implementation:

Gautam-Rajeev commented 3 months ago

@KDwevedi let me know if this can be picked up next sprint. It is somewhat needed to make testing easier.