Azure / azure-sdk-for-python

This repository is for active development of the Azure SDK for Python. For consumers of the SDK we recommend visiting our public developer docs at https://learn.microsoft.com/python/azure/ or our versioned developer docs at https://azure.github.io/azure-sdk-for-python.
MIT License

When registering custom models with the Python SDK, the register API has no option to pass the associated Job (RunID) #36551

Open sweanan opened 1 month ago

sweanan commented 1 month ago

Package Name: azureml.core.model
Package Version: 1.55.0


Is your feature request related to a problem? Please describe.

PROBLEM:

When I try to register a custom model with the Model.register() API as below:

    model = Model.register(ws, model_path=model_path, model_name=model_name)

The model gets registered, but the registered Model Artifact does not have the associated Job (RunID) that created it. Our customers would like to have this RunID associated with the artifact to track deployments and model registrations.

image

Workaround: we have used the internal API below, which allows us to pass the Job (RunID) as an input parameter. (This API is also called internally by Model.register().)

    asset = Model._create_asset(ws.service_context, model_path, model_name)

    model = Model._register_with_asset(ws, model_name=model_name, asset_id=asset.id, run_id=parent_run_id)

With this we are able to associate the RunID with the Model Artifact.


Describe the solution you'd like

The solution would be to add run_id as an optional input parameter to the Model.register() API.
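From the caller's side, the requested option could look roughly like the thin wrapper below. This is only a sketch built on the two private calls quoted in the workaround above. The function name `register_model_with_run_id` and the `model_cls` parameter are hypothetical, not part of azureml-core, and private methods can change without notice between SDK versions.

```python
def register_model_with_run_id(model_cls, workspace, model_path, model_name, run_id):
    """Register a model and attach a run ID, mirroring the workaround in this issue.

    `model_cls` is expected to be azureml.core.Model (passed in so the helper
    does not import the SDK itself); this helper is hypothetical.
    """
    # Upload the model files as an asset (private call quoted in this issue).
    asset = model_cls._create_asset(workspace.service_context, model_path, model_name)
    # Register the asset and pass the run ID along (private call quoted in this issue).
    return model_cls._register_with_asset(
        workspace,
        model_name=model_name,
        asset_id=asset.id,
        run_id=run_id,
    )
```

With the names used earlier in this issue, the call would be `register_model_with_run_id(Model, ws, model_path, model_name, parent_run_id)`.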


Describe alternatives you've considered

The workaround we are using is described above.

I also tried several other APIs for registering a custom model, as below, and none of them worked:

model = ml_client.models.create_or_update(cloud_model)

model = ml.register_model(model_input, model_name)


github-actions[bot] commented 1 month ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @Azure/azure-ml-sdk @azureml-github.

achauhan-scc commented 1 month ago

@isaudagar - assigning to you as it's a V1 issue.

isaudagar commented 1 month ago

Hi @sweanan, could you please provide some information about how you are generating the pkl file? Please share repro steps; that would help us find the root cause.

sweanan commented 1 month ago

Hello @isaudagar, thank you for looking into this. Below are the steps we use to generate the model artifact (we save tokenizer.pkl along with model.pt and the related files required for the model artifact registration).

WLM Transformer Model:

Train.py: Saving the model as well as the tokenizer

    torch.save(model.state_dict(), model_path)
    with open(self.tokenizer_path, "wb") as f:
        pickle.dump(t, f)

Predict.py: Loading the model

    model.load_state_dict(torch.load(model_path, map_location=device))

Register.py:

    asset = Model._create_asset(ws.service_context, model_path, model_name)
    model = Model._register_with_asset(ws, model_name=model_name, asset_id=asset.id, run_id=parent_run_id)

Below is a screenshot of the artifact. Please let me know if you need any other information.

image

@robcamer @marshallbentley

isaudagar commented 1 month ago

Hi @sweanan, I tried to reproduce this but was not able to. When I created a custom model, I was able to see the job (RunID). Could you please try with azureml-core 1.56.0?

image

sweanan commented 1 month ago

Hello @isaudagar, I will test it with that version and get back to you.

Thanks,
Swetha

isaudagar commented 1 month ago

Sure @sweanan, please refer to the links below. I hope they will be helpful for creating, training, and deploying a model.

https://learn.microsoft.com/en-us/azure/machine-learning/tutorial-designer-automobile-price-train-score?view=azureml-api-1
https://learn.microsoft.com/en-us/azure/machine-learning/tutorial-1st-experiment-sdk-train?view=azureml-api-1
https://learn.microsoft.com/en-us/azure/machine-learning/concept-train-machine-learning-model?view=azureml-api-1#run-configuration

sweanan commented 1 month ago

Hello @isaudagar

I tried with azureml-core version 1.56.0 but still see the same issue.

image

sweanan commented 1 month ago

Hi @isaudagar

Would you be able to share the Model.register() code that worked for you?

Also, we tried run.register_model(), which creates the associated job ID but fails to associate a dataset. (That method is in the azureml.core.run package, not the azureml.core.model package.) The issue above still persists with the azureml.core.model package.

isaudagar commented 1 month ago

Hi @sweanan,

If you want to link a job with a model, please use the sample code below.

    from azureml.core import Workspace, Experiment, Run, Model

    # Connect to the workspace and retrieve the run
    ws = Workspace.from_config()
    experiment = Experiment(workspace=ws, name='your-experiment-name')
    run = Run(experiment=experiment, run_id='your-run-id')

    # Register the model directly from the run
    model = run.register_model(
        model_name='my_model',
        model_path='outputs/model.pkl',  # Path within the run's outputs
        description='A model registered from a specific run',
        tags={'project': 'my_project', 'framework': 'Scikit-learn'},
        model_framework=Model.Framework.SCIKITLEARN,
        model_framework_version='0.24.1'
    )

    print(f"Model {model.name} registered with ID: {model.id} and Version: {model.version}")
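To confirm that a registration actually carried the job link, a small check like the following could be run on the returned object. It is a sketch that assumes the registered model exposes a `run_id` attribute and that the attribute is missing or `None` when no job is associated. The helper name `is_linked_to_run` is hypothetical.

```python
def is_linked_to_run(model, expected_run_id):
    """Return True when the registered model reports the expected run ID."""
    # Assumption: `run_id` is absent or None when no job is associated.
    return getattr(model, "run_id", None) == expected_run_id
```

For the sample above, `is_linked_to_run(model, 'your-run-id')` would be the corresponding call.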

sweanan commented 1 month ago

Hello @isaudagar

Work with registered models in Azure Machine Learning

This is what appears as documentation when I search for "Register a model in azure ml" and "register a model from a run in azure ml".

Using run.register_model is a workaround for the Model.register() method. We can use it for the time being.

run.register_model (and Model.register()) have further bugs with their implementation of the datasets parameter. Our customer would additionally like to associate the dataset that was used to train the model with the registered model. This functionality does not seem to work with any API (internal or external).

So we see the bugs below: