allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
https://clear.ml/docs
Apache License 2.0
5.61k stars 651 forks

Could not find queue named "services" using pipelines #1109

Open abelBEDOYA opened 1 year ago

abelBEDOYA commented 1 year ago

Describe the bug

I am trying to write a .py script in which a ClearML pipeline is defined using Python decorators. I would like to run it remotely using the ClearML queue system. To do so, I added PipelineDecorator.set_default_execution_queue('queue1'), and this error shows up:

  File "/home/faraujo/Desktop/clearML_API_pruebas/pipelines/tarea.py", line 52, in <module>
    run_pipeline(project='prueba-pipeline',task_name = 'progresion_aritmetica')
  File "/home/faraujo/anaconda3/lib/python3.9/site-packages/clearml/automation/controller.py", line 2935, in internal_decorator
    a_pipeline._task.execute_remotely(queue_name=pipeline_execution_queue)
  File "/home/faraujo/anaconda3/lib/python3.9/site-packages/clearml/task.py", line 2172, in execute_remotely
    Task.enqueue(task, queue_name=queue_name)
  File "/home/faraujo/anaconda3/lib/python3.9/site-packages/clearml/task.py", line 1121, in enqueue
    raise ValueError('Could not find queue named "{}"'.format(queue_name))
ValueError: Could not find queue named "services"

I have made sure there is a queue named queue1. Besides, I have not specified "services" as a queue anywhere.

To reproduce

# run_pipeline is the pipeline function defined earlier in tarea.py (not shown)
if __name__ == '__main__':
    # PipelineDecorator.run_locally()
    PipelineDecorator.set_default_execution_queue('queue1')
    run_pipeline(project='prueba-pipeline', task_name='progresion_aritmetica')

Expected behaviour

It should run the pipeline remotely using queue1.

ainoam commented 1 year ago

@abelBEDOYA The 'services' queue is the default for running the pipeline controller itself. set_default_execution_queue() only sets the queue on which the pipeline steps will run. To control where the controller itself will run, use the pipeline_execution_queue parameter of PipelineDecorator.pipeline.
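To make the distinction concrete, here is a minimal sketch in plain Python that models the two queue roles described above. The helpers `required_queues` and `missing_queues` are hypothetical and not part of ClearML, and the set of available queues is an assumed server state chosen for illustration. The sketch only shows why setting the step queue alone still leaves the controller pointing at "services".

```python
# Illustrative model only: these helpers are hypothetical, not ClearML APIs.

def required_queues(steps_queue, controller_queue="services"):
    """Return the queue each role of a remotely executed pipeline relies on.

    steps_queue mirrors PipelineDecorator.set_default_execution_queue().
    controller_queue mirrors the pipeline_execution_queue parameter of
    PipelineDecorator.pipeline, which defaults to "services" per the reply above.
    """
    return {"controller": controller_queue, "steps": steps_queue}


def missing_queues(required, available):
    """List the roles whose queue name is not among the available queues."""
    return sorted(role for role, name in required.items() if name not in available)


# Assumed server state for illustration: only these queues exist.
available = {"default", "queue1"}

# Setting only the step queue leaves the controller on "services".
print(missing_queues(required_queues("queue1"), available))

# Pointing the controller at an existing queue leaves nothing missing.
print(missing_queues(required_queues("queue1", controller_queue="queue1"), available))
```

In the real script, the second case corresponds to passing pipeline_execution_queue='queue1' to the PipelineDecorator.pipeline decorator in addition to calling set_default_execution_queue('queue1').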

Does this help?

abelBEDOYA commented 1 year ago

@ainoam Thanks for your answer! Passing the pipeline_execution_queue argument to PipelineDecorator.pipeline was the key. However, I'm now encountering an issue where the pipeline is stuck in a pending state and never starts running. I've taken some screenshots for reference.

Screenshot from 2023-09-01 13-12-02

Screenshot from 2023-09-01 13-12-16

It is pending, but it has been placed in a queue with a worker, so it should start running.

jkhenning commented 1 year ago

@abelBEDOYA what is the queue with the worker?

abelBEDOYA commented 1 year ago

@jkhenning The queue with the worker is colaVolumen. It shows a new experiment in the Next Experiment section, but the experiment doesn't start running.

jkhenning commented 1 year ago

OK, can you share the console output or log of the agent listening to this queue?

abelBEDOYA commented 1 year ago

@jkhenning I created an agent by running:

  clearml-agent daemon --queue colaVolumen --docker

It works properly for normal tasks, but pipelines always end up as pending tasks. How can I view the console output logs of the agent? I haven't found a command for this on the official website. Thanks for the help.