allegroai / clearml

ClearML - Auto-Magical CI/CD to streamline your AI workload. Experiment Management, Data Management, Pipeline, Orchestration, Scheduling & Serving in one MLOps/LLMOps solution
https://clear.ml/docs
Apache License 2.0

Bug: discrepancy in type causes destructive, automated casting without warning in HPO #1025

Closed Make42 closed 1 year ago

Make42 commented 1 year ago

When I mix boolean hyperparameters with string hyperparameters, all the string hyperparameters in a hyperparameter optimization get turned into the boolean False.

@PipelineDecorator.pipeline(...)
def executing_pipeline(method__training):
    ...

# base task
pipeline_result = executing_pipeline(
    method__training={'parB': False, 'parA': False}
)

HyperParameterOptimizer(
        ...
        hyper_parameters=[
            DiscreteParameterRange('Args/method__training/parA', values=[False, 'perLevel', 'perParentNode']),
            DiscreteParameterRange('Args/method__training/parB', values=[False , True]),
        ], ...)

I am not sure whether this happens because the values are ignored and the original False from the base task is used, or because they are actually converted to False. I am also not sure whether the problem is the mixing of types within DiscreteParameterRange or the fact that the base task used a different type.

[... one hour later ... after trying out more stuff ...]

When I use

@PipelineDecorator.pipeline(...)
def executing_pipeline(method__training):
    ...

# base task
pipeline_result = executing_pipeline(
    method__training={'parB': False, 'parA': 'perLevel'}  # HERE IS THE DIFFERENCE !
)

HyperParameterOptimizer(
        ...
        hyper_parameters=[
            DiscreteParameterRange('Args/method__training/parA', values=[False]),  # THE FALSE HERE GETS CAST
            DiscreteParameterRange('Args/method__training/parB', values=[False , True]),
        ], ...)

instead, I see that during the HPO the boolean False is cast by ClearML into a string. Since the DiscreteParameterRange contains only one type, namely boolean, the casting must be caused at least by the discrepancy between the type in the HPO and the type in the base task. Quite possibly, a discrepancy in type within DiscreteParameterRange could also cause problems.

Make42 commented 1 year ago

Arguments are always stored as strings. This must also be considered when floats are stored, because precision might get lost. Additionally, the type of each argument is stored with the base task. This means that if the arguments are manipulated later, for example by an HPO, the new argument value is also cast into a string, but the type into which the string is cast back is determined not by the new value's type but by the type of the original base task's value. If the cast is not possible (for example, the original type was a boolean, but the new argument value is "my_string"), we get an exception. If the cast is possible (for example, the original type was a string, but the new argument value is False), there may still be unintended consequences. If we want different casting behaviour, we should program the intended casting explicitly into the component or pipeline.
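
To make the behaviour described above concrete, here is a minimal pure-Python sketch. It is not ClearML's code: `store`, `cast_back`, `CASTERS` and `parse_par_a` are hypothetical names, and the boolean parsing rules are an assumption. It only mimics the model "store as string, cast back using the base task's type" and shows one possible way to do the explicit casting inside a component.

```python
from typing import Any, Callable, Dict


def _to_bool(text: str) -> bool:
    """Assumed boolean parsing rule; the real rule may differ."""
    lowered = text.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise ValueError(f"cannot cast {text!r} to bool")


# Hypothetical table: how a stored string is cast back for each base type.
CASTERS: Dict[type, Callable[[str], Any]] = {
    bool: _to_bool,
    int: int,
    float: float,
    str: str,
}


def store(value: Any) -> str:
    """Every argument value is stored as a string."""
    return str(value)


def cast_back(stored: str, base_type: type) -> Any:
    """Cast using the type recorded for the base task, not the new value's type."""
    return CASTERS[base_type](stored)


def parse_par_a(raw: Any) -> Any:
    """Explicit casting inside the component (hypothetical helper).

    Accepts either a real bool or the string "False" and returns the bool
    False; any other value (e.g. 'perLevel') is passed through unchanged.
    """
    if isinstance(raw, bool):
        return raw
    if isinstance(raw, str) and raw.strip().lower() == "false":
        return False
    return raw


# Base task used a string for parA, HPO supplies the boolean False:
silently_cast = cast_back(store(False), str)  # the string "False", which is truthy

# The explicit cast in the component recovers the intended value:
recovered = parse_par_a(silently_cast)
```

With a base type of `bool` and a new value such as "my_string", `cast_back` raises a `ValueError` in this sketch, which corresponds to the exception case mentioned above. The silent case is the more dangerous one, because the string "False" evaluates as true in an `if` check unless the component parses it explicitly.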