aws / sagemaker-experiments

Experiment tracking and metric logging for Amazon SageMaker notebooks and model training.
Apache License 2.0

General availability for `log_metric()` in processing jobs and beyond #160

Closed lorenzwalthert closed 1 year ago

lorenzwalthert commented 2 years ago

Is your feature request related to a problem? Please describe.

Yes. Coming from MLflow, I can log metrics from anywhere: it works locally, in a training job, in a processing job, on a Spark cluster, etc. I want to use SageMaker Experiments to track not just the training of a model, but also the pre- and post-processing of the features, in particular during SageMaker Processing jobs. Examples:

Original issue: https://github.com/aws/sagemaker-experiments/issues/151#issuecomment-1114991598

Describe the solution you'd like

I'd like to be able to call tracker.log_metric() from a processing run, ideally even from anywhere (also locally while debugging).

# some omitted import statements
with Tracker.create(display_name="evaluation", sagemaker_boto_client=sm) as tracker:
    tracker.log_metric(metric_name="...", value="my-value", timestamp=datetime.datetime.now())

Describe alternatives you've considered

Building my own metadata store in S3, which creates friction between my custom solution and Sagemaker Experiments.


morfaer commented 2 years ago

Any update on the timeline when this will be available?

Bustami commented 2 years ago

Hi @lorenzwalthert It would be great to have this issue solved already. I was looking for something similar, especially for recording custom metrics on a test dataset or on new data. Anyway, for the moment I got by using the function log_parameters() and it worked properly. Did you try that?
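The workaround described here, logging metrics through `log_parameters()`, would look roughly like the sketch below. The metric names and values are invented for illustration, and the Tracker calls need AWS credentials, so they are left commented out:

```python
# Hypothetical evaluation metrics, logged as *parameters* (the workaround above),
# so they show up in the trial comparison view even outside a training job.
eval_metrics = {"test_accuracy": 0.91, "test_f1": 0.88}

# Requires AWS credentials and the sagemaker-experiments package:
# from smexperiments.tracker import Tracker
# with Tracker.create(display_name="evaluation") as tracker:
#     tracker.log_parameters(eval_metrics)  # recorded as parameters, not metrics
```

The obvious drawback is that these values are treated as parameters, so they lack timestamps and steps and cannot be plotted as metric curves.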

lorenzwalthert commented 2 years ago

Are you suggesting to log metrics as parameters? Sounds like a hack to me. 🙃

Bustami commented 2 years ago

haha ok, at least meanwhile. Anyway, with that approach you are able to choose which parameters are shown when comparing trials or their components (that's the goal, isn't it?).

(screenshot: trial comparison view)

ycrouin commented 1 year ago

We can now send metrics using an API :pray:

aws sagemaker-metrics batch-put-metrics --trial-component-name my-trial-component-base-2022-12-09-094608133627-rbyz --metric-data MetricName=myAPImetric,Timestamp=2023-04-22T18:30:22.088Z,Step=0,Value=42

(screenshot: the metric shown in SageMaker Studio)

It would be nice to have it integrated with the Tracker object from the sagemaker-experiments package.

kirit93 commented 1 year ago

@lorenzwalthert - you can log metrics from anywhere using the SageMaker Python SDK - https://sagemaker.readthedocs.io/en/stable/experiments/sagemaker.experiments.html#run.
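With the SageMaker Python SDK linked above, metric logging moves to the `Run` class. A hedged sketch: the experiment and run names below are made up, and the SDK calls need the `sagemaker` package plus AWS credentials, so they are commented out while the surrounding logic runs anywhere:

```python
# Hypothetical evaluation results to log from, e.g., a processing job or locally.
metrics = {"test:rmse": 0.42, "test:mae": 0.31}

# Requires the sagemaker package and AWS credentials:
# from sagemaker.experiments.run import Run
# with Run(experiment_name="my-experiment", run_name="evaluation") as run:
#     for name, value in metrics.items():
#         run.log_metric(name=name, value=value)

# While debugging locally, the same dict can simply be printed instead:
for name, value in metrics.items():
    print(f"{name}={value}")
```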

lorenzwalthert commented 1 year ago

Yes, I saw that, thanks. Is this repo still under development or are you planning to archive it?