Open olucafont6 opened 1 year ago
Hi @olucafont6 ! This seems to be an issue related to the pipeline directory structure. First, to unhide a project, you can do this:
from clearml.backend_api.session.client import APIClient
client = APIClient()
project = client.projects.get_all(name="^Chat Testing$", search_hidden=True, _allow_extra_fields_=True)[0]
system_tags = project.system_tags
system_tags.remove("hidden")
# system_tags.remove("pipeline") # might also want to remove this tag
client.projects.update(project=project.id, system_tags=system_tags)
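One caveat with the snippet above: `list.remove()` raises `ValueError` when the tag is not present, for example if the project was already un-hidden. A minimal sketch of a more forgiving helper follows; `without_system_tags` is a hypothetical name used only for illustration and is not part of the ClearML API.

```python
def without_system_tags(system_tags, tags_to_remove=("hidden",)):
    """Return a copy of system_tags with the given tags dropped.

    Hypothetical helper (not a ClearML function). Unlike list.remove(),
    it does not raise when a tag is absent, and it accepts None.
    """
    unwanted = set(tags_to_remove)
    return [tag for tag in (system_tags or []) if tag not in unwanted]


# Drop only "hidden"; pass ("hidden", "pipeline") to drop both tags.
print(without_system_tags(["hidden", "pipeline"]))
```

The returned list can then be passed as `system_tags` to `client.projects.update(...)` exactly as in the snippet above.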
Then, to schedule a pipeline, change the target_project to <pipeline_project>/.pipelines/<pipeline_name>:
target_project='Chat Testing/.pipelines/chat-testing-pipeline'
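To avoid typos when building that path by hand, it can be assembled from its two parts. This is only a sketch of the naming pattern described in this comment; `pipeline_target_project` is a hypothetical helper, not a ClearML function.

```python
def pipeline_target_project(pipeline_project, pipeline_name):
    """Build '<pipeline_project>/.pipelines/<pipeline_name>'.

    Hypothetical helper; it only mirrors the path pattern quoted in this
    thread and does not query the ClearML server.
    """
    return "{}/.pipelines/{}".format(
        pipeline_project.strip("/"), pipeline_name.strip("/")
    )


print(pipeline_target_project("Chat Testing", "chat-testing-pipeline"))
```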
The current behaviour is not ideal. We will make some changes to the scheduler to properly handle pipelines.
@eugen-ajechiloae-clearml Awesome, thanks for the quick response!
I tried out the code you shared and it worked for un-hiding the project - after that I was able to run the pipeline fine (I didn't get the no-projects-found error).
> Then, to schedule a pipeline, change the target_project to <pipeline_project>/.pipelines/<pipeline_name>:
> target_project='Chat Testing/.pipelines/chat-testing-pipeline'
Interesting - does this mean we can schedule a pipeline without the task ID of a previous run of the pipeline? I tried messing around with the arguments, but it seemed like I could only change which project / pipeline the pipeline would run in with the target_project parameter - e.g., this doesn't work:
task_scheduler.add_task(minute=1, target_project="Chat Testing/.pipelines/chat-testing-pipeline",
queue="pioneer", recurring=False, execute_immediately=True)
(It gives this error:)
Traceback (most recent call last):
File "/home/jovyan/work/{redacted}/{redacted}/scheduler/scheduler.py", line 5, in <module>
task_scheduler.add_task(minute=1, schedule_function=None, target_project="Chat Testing/.pipelines/chat-testing-pipeline",
File "/opt/conda/lib/python3.11/site-packages/clearml/automation/scheduler.py", line 618, in add_task
mutually_exclusive(schedule_function=schedule_function, schedule_task_id=schedule_task_id)
File "/opt/conda/lib/python3.11/site-packages/clearml/backend_interface/util.py", line 215, in mutually_exclusive
at_least_one(_exception_cls=_exception_cls, _check_none=_check_none, **kwargs)
File "/opt/conda/lib/python3.11/site-packages/clearml/backend_interface/util.py", line 208, in at_least_one
raise _exception_cls('At least one of (%s) is required' % ', '.join(kwargs.keys()))
Exception: At least one of (schedule_function, schedule_task_id) is required
Not sure if that's what you meant though.
> The current behaviour is not ideal. We will make some changes to the scheduler to properly handle pipelines.
Sounds good - thanks!
@olucafont6 You need to specify a schedule_task_id as well.
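The traceback earlier in the thread comes from an argument check: add_task() needs either schedule_function or schedule_task_id. The following is a simplified stand-in for that kind of check, written for illustration only. It is not ClearML's implementation, and `require_exactly_one` is a hypothetical name.

```python
def require_exactly_one(**kwargs):
    """Return the name of the single keyword argument that is not None.

    Simplified illustration of an "exactly one of" argument check; this is
    an assumption-based sketch, not the code from clearml.
    """
    provided = [name for name, value in kwargs.items() if value is not None]
    if not provided:
        raise ValueError(
            "At least one of (%s) is required" % ", ".join(kwargs.keys())
        )
    if len(provided) > 1:
        raise ValueError("Only one of (%s) is allowed" % ", ".join(provided))
    return provided[0]


# Passing a task ID (placeholder value here) satisfies the check.
print(require_exactly_one(schedule_function=None, schedule_task_id="<task-id>"))
```

Calling it with both arguments set to None raises the same kind of message seen in the traceback, which is why target_project alone is not enough.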
@eugen-ajechiloae-clearml Okay, yeah that's more what the documentation sounded like.
I tried this out and was able to change which project the cloned task / pipeline ran in. Since I just wanted to run the task in the pipeline it was normally a part of, though, leaving out target_project worked fine.
Re my previous question about scheduling a run of the pipeline without an ID of a previous run (https://github.com/allegroai/clearml/issues/1137#issuecomment-1764935198), I was able to get this working with PipelineController.get():
from clearml.automation import TaskScheduler, PipelineController
chat_testing_pipeline = PipelineController.get(
pipeline_project="Chat Testing", pipeline_name="chat-testing-pipeline")
task_scheduler = TaskScheduler(
force_create_task_project="Scheduler", force_create_task_name="Scheduling Service")
task_scheduler.add_task(day=1, name="Daily Chat Testing", schedule_task_id=chat_testing_pipeline.id,
queue="pioneer", recurring=True, execute_immediately=True)
task_scheduler.start_remotely()
One of the examples makes it look like you can get this to work with Task.get_task(), but that didn't seem to work for me:
https://github.com/allegroai/clearml/blob/8a834af777d7c4d1541573158d627c9d39f5c7c5/examples/scheduler/cron_example.py#L15-L24
Describe the bug
When I use the TaskScheduler.add_task() function to schedule a task, and I specify a target_project, I get an error from the running pipeline that it can't find the specified project, and the project becomes hidden in the UI.
To reproduce
Basically I ran a Python script with this code:
(The task ID was the ID of a previous run of the pipeline.)
This started the task, but it failed when it got to the add_step case:
The "Chat Testing" project then went partially missing from the UI (you can see it if you select Show Hidden Projects):
If I try to run the pipeline / task a different way, I get the same error about not being able to find the project.
It seems like the project somehow got corrupted or something, but I'm not sure how to restore it so it acts normally.
I tried doing the same thing with another project and had the same problem.
Expected behaviour
Using the target_project parameter of TaskScheduler.add_task() would place the cloned Task in the specified project, and not corrupt the project (or whatever is happening).
Environment
ClearML SDK version: 1.11.1
ClearML Server version: 1.11.0-373
Python version: 3.11.4