allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
https://clear.ml/docs
Apache License 2.0
5.61k stars 651 forks

Do not include the entire caller script into a pipeline-decorated function's script #1040

Open Make42 opened 1 year ago

Make42 commented 1 year ago

Each component (e.g., a decorated function) ends up packaged in a separate script file, and that script file is run on the remote machine. The script generated from a component gets a main, which takes the same arguments as the component function. When one of the component function's arguments is changed, e.g., during a hyperparameter optimization, the script's argparse object is manipulated directly. The script is put into a task. In the Web UI, the script can be found in the component's task under "Execution" -> "Uncommitted changes".
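To make the packaging idea concrete, here is a minimal sketch in plain Python. It is not ClearML's actual implementation; `build_component_script`, `CALLER_SCRIPT`, and `train` are hypothetical names chosen for illustration. It extracts only the function from a calling script and appends a generated main whose argparse arguments mirror the function's arguments.

```python
import ast

# Hypothetical calling script: one function plus module-level code.
CALLER_SCRIPT = '''\
def train(lr=0.1, epochs=3):
    return round(lr * epochs, 3)

# Module-level code of the calling script that the generated
# component script should NOT contain.
print("launching pipeline and HPO ...")
'''


def build_component_script(caller_source, func_name):
    """Build a standalone script holding only `func_name` plus a generated main."""
    tree = ast.parse(caller_source)
    func = next(
        node for node in tree.body
        if isinstance(node, ast.FunctionDef) and node.name == func_name
    )
    lines = [
        ast.get_source_segment(caller_source, func),  # function body only
        "",
        "",
        "if __name__ == '__main__':",
        "    import argparse",
        "    parser = argparse.ArgumentParser()",
    ]
    args = func.args.args
    defaults = func.args.defaults
    # Defaults align with the trailing positional arguments.
    offset = len(args) - len(defaults)
    for i, arg in enumerate(args):
        if i >= offset:
            # Assumption: defaults are plain Python literals.
            value = ast.literal_eval(defaults[i - offset])
            type_part = (
                f"type={type(value).__name__}, "
                if type(value) in (int, float, str) else ""
            )
            lines.append(
                f"    parser.add_argument('--{arg.arg}', {type_part}default={value!r})"
            )
        else:
            lines.append(f"    parser.add_argument('--{arg.arg}', required=True)")
    lines.append("    args = parser.parse_args()")
    lines.append(f"    print({func_name}(**vars(args)))")
    return "\n".join(lines) + "\n"


component_script = build_component_script(CALLER_SCRIPT, "train")
```

Here `component_script` holds the `train` function and an argparse-driven main, but none of the module-level code from the calling script.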

A component is called by the pipeline. The pipeline is run mostly like a task; however, a pipeline's script is not only created from the pipeline-function, but includes everything in the script that is calling the pipeline.

This seems inconsistent with how components are handled, and seems to have historical reasons. Maybe a user wanted this, and it was not considered that the same could be achieved by simply putting everything into the pipeline function's code; of course, I do not know. Maybe there is a very good reason that I am not aware of; in that case, I am genuinely interested in the use case :-).

I suggest making this consistent by including only the code inside the pipeline-decorated function in the pipeline's script. This would allow for more controlled software engineering. For example, one could have a pipeline definition and an HPO of this pipeline (which actually makes quite a bit of sense) in the same script without any hacking. To ensure backwards compatibility, a parameter of the pipeline decorator could select the old behavior instead of the new one.
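A rough sketch of what such a switch could look like, again in plain Python rather than ClearML code. The flag name `include_caller_script` and the helper `pipeline_script` are hypothetical and only illustrate the two behaviors side by side.

```python
import ast

# Hypothetical calling script: a pipeline function plus an HPO-style loop.
CALLER_SCRIPT = '''\
def my_pipeline(lr=0.1):
    return lr * 2

# Code that merely calls the pipeline, e.g. an HPO loop over `lr`.
for lr in (0.01, 0.1):
    my_pipeline(lr=lr)
'''


def pipeline_script(caller_source, func_name, include_caller_script=False):
    """Return the script text to store for the pipeline task.

    `include_caller_script` is a hypothetical flag: True keeps the behavior
    described in this issue (whole calling script), False is the proposal
    (only the pipeline function's own code).
    """
    if include_caller_script:
        return caller_source
    tree = ast.parse(caller_source)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == func_name:
            return ast.get_source_segment(caller_source, node) + "\n"
    raise ValueError(f"no top-level function named {func_name!r}")


proposed = pipeline_script(CALLER_SCRIPT, "my_pipeline")
legacy = pipeline_script(CALLER_SCRIPT, "my_pipeline", include_caller_script=True)
```

With the flag off, the HPO loop stays out of the pipeline's script; with it on, the whole calling script is kept as before.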