aws / sagemaker-experiments

Experiment tracking and metric logging for Amazon SageMaker notebooks and model training.
Apache License 2.0

Unable to create a Tracker #113

Closed anotinelg closed 3 years ago

anotinelg commented 3 years ago

Describe the bug: In a Jupyter notebook, I am not able to create a tracker. I am not sure whether this is really a bug or a misunderstanding on my part.

To Reproduce: run the following in a Jupyter notebook:

from smexperiments import tracker

with tracker.Tracker.create(display_name="test-tracker") as trial_tracker:
    trial_tracker.log_parameter('test', 0.01)

This fails with:

ClientError: An error occurred (ValidationException) when calling the CreateTrialComponent operation: Trial Component creation is currently restricted to the SageMaker runtime. Try supplying an experiment config when creating a job instead.

Expected behavior: If I understand the [examples](https://github.com/shashankprasanna/sagemaker-experiments-examples/blob/master/sagemaker-experiment-examples.ipynb) correctly, this code should not cause any problem.


Environment: Jupyter notebook with sagemaker_experiments 0.1.25 and sagemaker 2.19.0


matkalinowski commented 3 years ago

Is there any news on this, or any possible solution? I am having the same issue when trying to run a SageMaker PyTorch estimator with instance_type = 'local'.

wangweixun commented 3 years ago

Hi there, sorry for the late response. Can I ask where you are running the code from? The Python SDK code above calls the CreateTrialComponent API, which is restricted to SageMaker environments only: https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTrialComponent.html.

anotinelg commented 3 years ago

> Hi there, sorry for the late response. Can I ask where you are running the code from? The Python SDK code above calls the CreateTrialComponent API, which is restricted to SageMaker environments only: https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateTrialComponent.html.

In my case, I was trying to do that in a SageMaker Jupyter notebook.

danabens commented 3 years ago

Hi Antoine, you should use Tracker.load instead of Tracker.create in this case. The load method is the most common way to instantiate the tracker.

tf_estimator.fit(...job_name = job_name...)

tracker = Tracker.load(training_job_name=job_name, ...)
tracker.log_parameters(hyperparams)
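
The suggestion above could be wrapped in a small helper like the one below. This is only a sketch under a few assumptions: `log_job_parameters` and its `tracker_cls` argument are hypothetical names introduced here for illustration, the training job named by `job_name` already exists, and leaving the `with` block is what saves the logged parameters to the job's trial component.

```python
def log_job_parameters(job_name, hyperparams, tracker_cls=None):
    """Attach hyperparameters to the trial component of an existing training job.

    `tracker_cls` is a hypothetical hook so the helper can be tried with a
    stand-in class; by default the real smexperiments Tracker is used.
    """
    if tracker_cls is None:
        # Imported lazily so the module loads even without smexperiments installed.
        from smexperiments.tracker import Tracker as tracker_cls

    # Tracker.load attaches to a trial component that already exists for the
    # job, instead of calling CreateTrialComponent as Tracker.create does.
    with tracker_cls.load(training_job_name=job_name) as job_tracker:
        job_tracker.log_parameters(hyperparams)
    return job_tracker
```

A call would then look like `log_job_parameters(job_name, hyperparams)` once `tf_estimator.fit(...)` has started the job.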

matkalinowski commented 3 years ago

@wangweixun Thank you for the response. So it is not possible to run those experiments locally and log the results to AWS Experiments (even from the supported Docker image)?

danabens commented 3 years ago

> is not possible to run those experiments locally and log the results to the AWS experiments? (even from the supported docker image?)

It is not possible. For metrics to show up in Studio/Experiments, log_metric must be called from a training job instance.
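
Given that restriction, a training script that is also run locally might guard its metric calls, as in the sketch below. It rests on an assumption not confirmed in this thread: that a SageMaker training container sets the `TRAINING_JOB_ARN` environment variable. The helper names `running_in_training_job` and `log_metric_if_supported` are hypothetical.

```python
import os

# Assumption: SageMaker training containers export this variable.
TRAINING_JOB_ARN_ENV = "TRAINING_JOB_ARN"


def running_in_training_job(environ=None):
    """Return True when the (assumed) training job marker variable is set."""
    environ = os.environ if environ is None else environ
    return bool(environ.get(TRAINING_JOB_ARN_ENV))


def log_metric_if_supported(job_tracker, name, value, environ=None):
    """Call job_tracker.log_metric only inside a training job.

    Returns True if the metric was logged, False if the call was skipped.
    """
    if not running_in_training_job(environ):
        return False
    job_tracker.log_metric(name, value)
    return True
```

With this guard, the same script can run with instance_type = 'local' without attempting to log metrics, at the cost of those metrics not appearing in Studio/Experiments.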