allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
https://clear.ml/docs
Apache License 2.0
5.43k stars 643 forks

PR: Add support to TaskScheduler() to accept a function called at runtime to return a task_id #1211

Open natephysics opened 4 months ago

natephysics commented 4 months ago

Proposal Summary

I'm working on a PR that adds a new parameter, task_id_function, to TaskScheduler.add_task(). It takes a callable that is expected to return a task_id. The callable is run at runtime (when the task scheduler would normally execute the scheduled task), and the scheduler uses the returned task_id, together with the other parameters passed to .add_task(), as the scheduled task.
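To make the intended behavior concrete, here is a minimal, self-contained sketch of deferred task_id resolution. It does not use ClearML at all: ScheduledJob, resolve_task_id, and launch are hypothetical stand-ins for whatever the scheduler does internally, and only the name task_id_function comes from the proposal. The point is just that the callable runs when the job fires, not when it is registered.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class ScheduledJob:
    """Toy stand-in for one scheduler entry (hypothetical, not ClearML's class)."""
    queue: str
    task_id: Optional[str] = None
    task_id_function: Optional[Callable[[], str]] = None
    task_parameters: dict = field(default_factory=dict)

    def __post_init__(self):
        # Require exactly one source for the task id: a fixed id or a callable.
        if (self.task_id is None) == (self.task_id_function is None):
            raise ValueError("provide exactly one of task_id or task_id_function")

    def resolve_task_id(self) -> str:
        # The callable is invoked here, i.e. at trigger time, not at registration.
        if self.task_id_function is not None:
            return self.task_id_function()
        return self.task_id


def launch(job: ScheduledJob) -> dict:
    """Stand-in for the launch step: combine the resolved id with the other settings."""
    return {
        "task_id": job.resolve_task_id(),
        "queue": job.queue,
        "task_parameters": dict(job.task_parameters),
    }
```

In this sketch, a job registered with a callable picks up whatever that callable returns at the moment launch() is called, while queue and task_parameters are still taken from the job entry as usual.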

Motivation

Why is this useful? There's a host of reasons, but the biggest one is that it gives users much more control over the tasks run by the task scheduler. Currently, as far as I can tell, the task scheduler can't run the most recent task (at runtime) from a given project with a specific tag. I can use the schedule_function parameter and create a function that finds and runs the task, but then I lose the core advantages of .add_task(): there's no way to specify queues, task_parameters, and task_overrides. Naturally, I could wrap all of that into the function called by schedule_function, but at that point I'm basically just writing my own scheduler. This will also let you do some preprocessing before returning the task_id, for example, if you wanted to clean up old tasks.
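As an illustration of the "most recent task with a specific tag" case, here is a sketch of the kind of function that could be passed as task_id_function. It works on plain in-memory records so that it runs on its own; the record layout (id, project, tags, created), the sample data, and the name latest_task_id are all assumptions made for this example. In a real setup the records would presumably come from a ClearML query instead.

```python
from datetime import datetime
from typing import Iterable, Mapping


def latest_task_id(tasks: Iterable[Mapping], project: str, tag: str) -> str:
    """Return the id of the newest task in `project` that carries `tag`."""
    candidates = [
        t for t in tasks
        if t["project"] == project and tag in t["tags"]
    ]
    if not candidates:
        # Fail loudly so a scheduler would not silently launch nothing.
        raise LookupError(f"no task tagged {tag!r} in project {project!r}")
    # "Most recent" is taken to mean the latest creation time (an assumption).
    return max(candidates, key=lambda t: t["created"])["id"]


# Hypothetical sample records, for illustration only.
SAMPLE_TASKS = [
    {"id": "t1", "project": "demo", "tags": ["prod"], "created": datetime(2024, 1, 5)},
    {"id": "t2", "project": "demo", "tags": ["dev"], "created": datetime(2024, 2, 1)},
    {"id": "t3", "project": "demo", "tags": ["prod", "nightly"], "created": datetime(2024, 1, 20)},
    {"id": "t4", "project": "other", "tags": ["prod"], "created": datetime(2024, 3, 1)},
]
```

With the proposed parameter, something like task_id_function=lambda: latest_task_id(SAMPLE_TASKS, "demo", "prod") would be evaluated each time the schedule fires, so the selection always reflects the tasks that exist at that moment. Any preprocessing, such as cleaning up old tasks, could happen inside the same function before it returns.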

Related Discussion

https://clearml.slack.com/archives/CTK20V944/p1708447659999379?thread_ts=1708445057.172119&cid=CTK20V944