aws / sagemaker-experiments

Experiment tracking and metric logging for Amazon SageMaker notebooks and model training.
Apache License 2.0
125 stars, 36 forks

Tracker object doesn't write metrics inside of Processing Job #121

Closed: jmgray24 closed this issue 3 years ago

jmgray24 commented 3 years ago

I am currently unable to write metrics with the Tracker class's log_metric method from inside a Processing Job.

It appears that the metrics_writer is only created when the code detects that it is running inside a Training Job. Is this intentional?

https://github.com/aws/sagemaker-experiments/blob/1f1a0b780bdee42c8293d032d99279af6e44ded0/src/smexperiments/tracker.py#L108

danabens commented 3 years ago

Hi @jmgray24, yes, this is intentional. log_metric works by first writing the metrics to a file. A metrics agent then picks up that file and inserts the metrics into SageMaker. This agent only runs on training job hosts, so metrics written anywhere else would never be collected. Message me via Slack if you would like more detail.
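
For illustration, the file-based pattern described above could look roughly like the sketch below. This is not the library's actual code: the class name, function name, record fields, and the use of a TRAINING_JOB_ARN environment variable as the detection signal are all assumptions made for the example.

```python
import json
import os
import tempfile
import time

# Assumed marker variable; the real library's detection logic may differ.
TRAINING_JOB_ENV = "TRAINING_JOB_ARN"


class FileMetricsWriter:
    """Hypothetical writer: appends one JSON record per metric to a file.

    A separate agent process (not shown) would be responsible for reading
    the file and forwarding the records to the backend.
    """

    def __init__(self, path):
        self.path = path

    def log_metric(self, name, value, step=None):
        # Field names here are illustrative, not the library's schema.
        record = {"MetricName": name, "Value": float(value), "Timestamp": time.time()}
        if step is not None:
            record["IterationNumber"] = step
        with open(self.path, "a") as f:
            f.write(json.dumps(record) + "\n")


def make_metrics_writer(path, environ=None):
    """Return a writer only when the marker variable is set, else None."""
    environ = os.environ if environ is None else environ
    if TRAINING_JOB_ENV in environ:
        return FileMetricsWriter(path)
    # No agent is assumed to be running here, so no writer is created.
    return None


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "metrics.json")
    writer = make_metrics_writer(demo_path)
    if writer is None:
        print("Not in a training job: metrics would not be collected.")
    else:
        writer.log_metric("loss", 0.5, step=1)
```

In this sketch, a caller inside a Processing Job would get None back from make_metrics_writer, which mirrors the behavior reported in this issue: without an agent on the host to collect the file, creating a writer would serve no purpose.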

lorenzwalthert commented 2 years ago

It would be useful for processing jobs as well.