How to properly set trials for the comet optimizer?

eiphy commented 2 years ago

Before Asking:

[x] I have searched the Issue Tracker.
[x] I have searched the Documentation.

What is your question related to?

[x] Comet Python SDK
[ ] Comet UI
[ ] Third Party Integrations (Huggingface, TensorboardX, Pytorch Lightning etc.)

What is your question?

I want to know how to properly set the trials parameter for the comet optimizer. My current attempts result in weird behaviour. The random seed is unchanged within each parameter set. The bayes optimizer only searches trials times (this is related to this issue).

In addition, may I know whether the metric is automatically averaged across all trials? Or should I log the averaged metric myself?

Code

import json
import random

import comet_ml

op_cfg = {
    # We pick the Bayes algorithm:
    "algorithm": "bayes",
    # Declare your hyperparameters in the Vizier-inspired format:
    "parameters": {
        "x": {"type": "integer", "min": 1, "max": 5},
        "seed": {"type": "integer", "scalingType": "linear", "min": 1, "max": 5},
    },
    # Declare what we will be optimizing, and how:
    "spec": {"metric": "loss", "objective": "minimize", "maxCombo": 50},
    "trials": 5,
}

op = comet_ml.Optimizer(op_cfg)
print(json.dumps(op.status(), indent=4))

for exp in op.get_experiments(display_summary_level=0):
    loss = random.random() + exp.get_parameter("x")
    exp.log_metric("loss", loss)
    print("x", exp.get_parameter("x"))
    print("seed", exp.get_parameter("seed"))

What have you tried?

The output of the above code:

COMET INFO: COMET_OPTIMIZER_ID=21ded5540468483699f0821c452a59a6
COMET INFO: Using optimizer config: {'algorithm': 'bayes', 'configSpaceSize': 'infinite', 'endTime': None, 'id': '21ded5540468483699f0821c452a59a6', 'lastUpdateTime': None, 'maxCombo': 50, 'name': '21ded5540468483699f0821c452a59a6', 'parameters': {'seed': {'max': 5, 'min': 1, 'scalingType': 'linear', 'type': 'integer'}, 'x': {'max': 5, 'min': 1, 'scalingType': 'uniform', 'type': 'integer'}}, 'predictor': None, 'spec': {'gridSize': 10, 'maxCombo': 50, 'metric': 'loss', 'minSampleSize': 100, 'objective': 'minimize', 'retryAssignLimit': 0, 'retryLimit': 1000}, 'startTime': 27974916145, 'state': {'mode': None, 'seed': None, 'sequence': [], 'sequence_i': 0, 'sequence_pid': None, 'sequence_retry': 0, 'sequence_retry_count': 0}, 'status': 'running', 'suggestion_count': 0, 'trials': 5, 'version': '2.0.1'}
{
    "algorithm": "bayes",
    "configSpaceSize": "infinite",
    "endTime": null,
    "id": "21ded5540468483699f0821c452a59a6",
    "lastUpdateTime": null,
    "maxCombo": 50,
    "name": "21ded5540468483699f0821c452a59a6",
    "parameters": {
        "seed": {
            "max": 5,
            "min": 1,
            "scalingType": "linear",
            "type": "integer"
        },
        "x": {
            "max": 5,
            "min": 1,
            "scalingType": "uniform",
            "type": "integer"
        }
    },
    "predictor": null,
    "spec": {
        "gridSize": 10,
        "maxCombo": 50,
        "metric": "loss",
        "minSampleSize": 100,
        "objective": "minimize",
        "retryAssignLimit": 0,
        "retryLimit": 1000
    },
    "startTime": 27974916145,
    "state": {
        "mode": null,
        "seed": null,
        "sequence": [],
        "sequence_i": 0,
        "sequence_pid": null,
        "sequence_retry": 0,
        "sequence_retry_count": 0
    },
    "status": "running",
    "suggestion_count": 0,
    "trials": 5,
    "version": "2.0.1"
}
COMET INFO: Experiment is live on comet.ml 
x 3
seed 2
COMET INFO: Experiment is live on comet.ml 

x 3
seed 2
COMET INFO: Experiment is live on comet.ml 

x 3
seed 2
COMET INFO: Experiment is live on comet.ml 

x 3
seed 2
COMET INFO: Experiment is live on comet.ml 

x 3
seed 2
COMET INFO: Optimizer search 21ded5540468483699f0821c452a59a6 has completed
COMET INFO: Uploading metrics, params, and assets to Comet before program termination (may take several seconds)
COMET INFO: The Python SDK has 3600 seconds to finish before aborting...
COMET INFO: Uploading 1 metrics, params and output messages

By setting the algorithm to "grid", the optimizer can search multiple parameter sets. However, the random seed still does not change within each parameter set.

dsblank commented 2 years ago

@eiphy Yes, when using the "bayes" algorithm, the optimizer uses the evidence from the logged metric to determine the next possible combinations. But if bayes determines it has already given all of its best guesses, then it can't generate any additional combos.

I tried "grid" with your config, and indeed get different seed values. Here is another way to test the optimizer, without having to create experiments:

from comet_ml import Optimizer

op_cfg = {
    "algorithm": "grid",
    "parameters": {
        "x": {"type": "integer", "min": 1, "max": 5},
        "seed": {"type": "integer", "scalingType": "linear", "min": 1, "max": 5},
    },
    "spec": {"metric": "loss", "objective": "minimize", "maxCombo": 50},
    "trials": 5,
}

op = Optimizer(op_cfg)

for params in op.get_parameters():
    print(params)

That gives 100 combinations: 4 each for x [1, 5), 4 each for seed [1, 5), times 5 trials each (4 * 4 * 5) = 100.

eiphy commented 2 years ago

@dsblank Thank you for the reply! The code is quite useful for debugging.

I'm not clear why the bayes algorithm decides the first parameter set is the best guess, as it has literally tried only one combination. In addition, when I set trials to 1, it runs correctly and gives 20 combinations.

It is actually 4 * 5 * 5. It seems that seed is drawn from [1, 5) and x from [1, 5]. I wonder why they have different ranges.
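The two counts can be checked without the SDK. This is a minimal sketch in plain Python, assuming a grid search enumerates the Cartesian product of the value ranges and repeats each combination once per trial. The count_combinations helper is hypothetical and not part of comet_ml.

```python
from itertools import product


def count_combinations(x_values, seed_values, trials):
    """Count suggestions if every (x, seed) pair is repeated `trials` times."""
    grid = list(product(x_values, seed_values))
    return len(grid) * trials


# Hypothesis A: both ranges are half-open, [1, 5), so 4 values each.
half_open = count_combinations(range(1, 5), range(1, 5), trials=5)

# Hypothesis B: seed is half-open [1, 5) and x is closed [1, 5],
# so 4 seed values and 5 x values.
mixed = count_combinations(range(1, 6), range(1, 5), trials=5)

print(half_open)  # 80
print(mixed)      # 100
```

Only the mixed case reproduces the 100 suggestions observed with trials set to 5, and it also matches the 20 combinations seen with trials set to 1.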

The random seed still does not change within the same parameter set. I think each trial should have a different random seed:

trial 1: x=2; seed=1
trial 2: x=2; seed=2
trial 3: x=2; seed=3
...

Current behaviour is like:

trial 1: x=2; seed=1
trial 2: x=2; seed=1
trial 3: x=2; seed=1
...
trial 1: x=4; seed=2
trial 2: x=4; seed=2
trial 3: x=4; seed=2
...
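One possible workaround is to drop seed from the search space and derive it locally from how many times the same parameter set has already been seen. This is a minimal sketch in plain Python, assuming repeated trials arrive as identical parameter sets, as in the output above. The seeded_trials helper and the sample suggestions list are hypothetical and not part of comet_ml. The sketch also averages the loss per parameter set locally, since this thread does not establish whether the optimizer averages across trials.

```python
import random
from collections import defaultdict
from statistics import mean


def seeded_trials(suggestions):
    """Yield (params, seed), where seed counts repeats of the same params."""
    seen = defaultdict(int)
    for params in suggestions:
        key = tuple(sorted(params.items()))
        seen[key] += 1
        yield params, seen[key]


# Hypothetical stand-in for the optimizer's suggestions:
# two parameter sets, three trials each.
suggestions = [{"x": 2}] * 3 + [{"x": 4}] * 3

losses = defaultdict(list)
seeds = []
for params, seed in seeded_trials(suggestions):
    rng = random.Random(seed)          # a different seed for each trial
    loss = rng.random() + params["x"]  # same toy objective as in the issue
    losses[params["x"]].append(loss)
    seeds.append((params["x"], seed))

# Mean loss per parameter set, computed locally.
mean_loss = {x: mean(values) for x, values in losses.items()}
print(seeds)
print(mean_loss)
```

With this approach each repeat of x=2 runs with seed 1, 2, 3, and the per-set mean could be logged as its own metric if the optimizer turns out not to aggregate trials.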

dsblank commented 2 years ago

Hmmm... interesting! Yes, I think you have uncovered a couple of issues. We'll add those to our list and get back to you. Thank you for the detailed analysis!

dsblank commented 2 years ago

FYI, these issues are being tracked as CM-1900.

dsblank commented 1 year ago

Moved to internal tracking issue EXT-1340.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] commented 1 year ago

This issue was closed because it has been stalled for 5 days with no activity.
