aws / sagemaker-experiments

Experiment tracking and metric logging for Amazon SageMaker notebooks and model training.
Apache License 2.0

How to assign a Trial to every training job using HyperparameterTuner ? #94

Closed tchaton closed 3 years ago

tchaton commented 4 years ago


danabens commented 4 years ago

Hi Thomas, good question. I'm guessing it's possible with something like this:

# list all Training Jobs created by the Hyperparameter Tuning Job
# for each Training Job
#   get the Trial Component created from the Training Job
#   associate that Trial Component with the Trial 

I'm investigating and will update this issue.
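The steps outlined above could be sketched roughly as follows with a boto3 SageMaker client. This is only a sketch under assumptions: the function name, its parameters, and the `-aws-training-job` suffix used to derive the Trial Component name from the Training Job name are not confirmed in this thread. The client is passed in so the logic can be exercised without AWS credentials.

```python
import time


def associate_tuning_job_with_trial(sm_client, tuning_job_name, trial_name,
                                    suffix="-aws-training-job", pause_seconds=0.5):
    """Associate the Trial Component of each tuning Training Job with a Trial.

    sm_client is expected to behave like boto3.client("sagemaker").
    The suffix is an ASSUMED naming convention for auto-created Trial Components.
    """
    associated = []
    next_token = None
    while True:
        # List the Training Jobs created by the Hyperparameter Tuning Job.
        kwargs = {"HyperParameterTuningJobName": tuning_job_name, "MaxResults": 100}
        if next_token:
            kwargs["NextToken"] = next_token
        response = sm_client.list_training_jobs_for_hyper_parameter_tuning_job(**kwargs)

        for summary in response["TrainingJobSummaries"]:
            # Derive the Trial Component name (assumed convention) and attach it.
            trial_component_name = summary["TrainingJobName"] + suffix
            sm_client.associate_trial_component(
                TrialComponentName=trial_component_name,
                TrialName=trial_name,
            )
            associated.append(trial_component_name)
            time.sleep(pause_seconds)  # small pause to reduce the chance of throttling

        # Follow pagination until there are no more pages.
        next_token = response.get("NextToken")
        if not next_token:
            break
    return associated
```

With real credentials this would be called as `associate_tuning_job_with_trial(boto3.client("sagemaker"), tuning_job_name, trial.trial_name)`, where both names come from your own setup.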

danabens commented 3 years ago

see example https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-experiments/associate-hyper-parameter-tuning-job/associate-hyperparameter-tuning-job.ipynb

morelen17 commented 3 years ago

> see example https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-experiments/associate-hyper-parameter-tuning-job/associate-hyperparameter-tuning-job.ipynb

@danabens It didn't work for me, unfortunately.

However, I found out that this approach works:

import time

from smexperiments.search_expression import Filter, Operator, SearchExpression
from smexperiments.trial import Trial
from smexperiments.trial_component import TrialComponent

trial = Trial.create(...)
tuning_job_name = '...'

...

# Match trial components whose name contains the tuning job name
# and whose source is a SageMaker training job.
search_expression = SearchExpression(
    filters=[
        Filter('TrialComponentName', Operator.CONTAINS, tuning_job_name),
        Filter('Source.SourceType', Operator.EQUALS, 'SageMakerTrainingJob'),
    ],
)

# Attach every matching trial component to the trial.
trial_component_search_results = TrialComponent.search(search_expression=search_expression)
for tc in trial_component_search_results:
    trial.add_trial_component(tc.trial_component_name)
    time.sleep(0.5)  # sleep to avoid throttling

Could you please confirm or deny that this is a valid approach to attach trial components of HPO training jobs to a trial?

danabens commented 3 years ago

Yes, that works: the trial component name contains the training job name, and the training job name contains the tuning job name.
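One caveat with a plain "contains" match is that a tuning job named `hpo-1` is also a substring of names belonging to `hpo-10`. A stricter client-side check could look like the sketch below. The exact name layout it assumes (`<tuning job>-<index>-<hex hash>-aws-training-job`) is my assumption and is not confirmed in this thread, and `belongs_to_tuning_job` is a hypothetical helper, not part of smexperiments.

```python
import re


def belongs_to_tuning_job(trial_component_name, tuning_job_name):
    """Return True if the trial component name looks like it came from this tuning job.

    ASSUMED layout: <tuning_job_name>-<digits>-<hex>-aws-training-job
    """
    pattern = re.compile(
        r"^" + re.escape(tuning_job_name) + r"-\d+-[0-9a-f]+-aws-training-job$"
    )
    return pattern.match(trial_component_name) is not None


# Hypothetical names for illustration only.
candidates = [
    "hpo-1-001-ab12cd34-aws-training-job",
    "hpo-10-001-ef56ab78-aws-training-job",
]
matching = [name for name in candidates if belongs_to_tuning_job(name, "hpo-1")]
```

Such a check could be applied to each search result before calling `trial.add_trial_component`, so that only components from the intended tuning job are attached.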

lorenzwalthert commented 2 years ago

It would save a ton of boilerplate code if we could just specify the experiment and trial when creating the hyperparameter job.

mbbourgo commented 2 years ago

It would save a ton of boilerplate code if the HyperparameterTuner.fit() method passed its "kwargs" argument to the Estimator.fit() method. Then you could supply the "experiment_config" dict with just an ExperimentName and let the HyperparameterTuner create the trials for each run.
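For reference, here is a minimal sketch of the kind of "experiment_config" dict being described. To my knowledge, Estimator.fit() in the SageMaker Python SDK accepts the keys `ExperimentName`, `TrialName`, and `TrialComponentDisplayName`. The helper `build_experiment_config` and the experiment name are hypothetical, and forwarding the dict through HyperparameterTuner.fit() is the requested behaviour, not something confirmed in this thread.

```python
def build_experiment_config(experiment_name, trial_name=None, display_name=None):
    """Build an experiment_config dict, including only the keys that were supplied."""
    config = {"ExperimentName": experiment_name}
    if trial_name is not None:
        config["TrialName"] = trial_name
    if display_name is not None:
        config["TrialComponentDisplayName"] = display_name
    return config


# Hypothetical experiment name; only ExperimentName is set, as suggested above.
experiment_config = build_experiment_config("my-experiment")

# estimator.fit(inputs, experiment_config=experiment_config)  # Estimator.fit() parameter
# tuner.fit(inputs, experiment_config=experiment_config)      # requested pass-through, unconfirmed
```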