Azure / mlops-v2

Azure MLOps (v2) solution accelerators. Enterprise-ready templates to deploy your machine learning models on the Azure platform.
https://learn.microsoft.com/en-us/azure/machine-learning/concept-model-management-and-deployment
MIT License

CV Job Fails with error in train.py (standard project template) #93

Status: Closed (closed by jplummer01 1 year ago)

jplummer01 commented 1 year ago

Describe the bug or the issue that you are facing

[two images attached; not preserved]

Steps/Code to Reproduce

Deploy the CV Sample and run deploy-cv-model-training-pipeline.yml

Expected Output

Expecting the sample to run successfully and train the model.

Versions

As per the MLOps v2 accelerator and project templates.

Which platform are you using for deploying your infrastructure?

GitHub Actions (GitHub)

If you mentioned Others, please mention which platform you are using.

No response

What are you using for deploying your infrastructure?

Terraform

Are you using Azure ML CLI v2 or Azure ML Python SDK v2?

Azure ML CLI v2

Describe the example that you are trying to run?

Prebuilt CV example. Attachment: Job_train_OutputsAndLogs.zip

sdonohoo commented 1 year ago

This is caused by a bug in the MLflow integration with Azure ML. Currently, MLflow automatically logs all parameters passed to the job by default. Any explicit mlflow.log_params() call that duplicates one of these parameters causes this failure. The recommended workaround for now is to remove explicit calls to mlflow.log_params() from the training code.
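If removing the calls outright is not practical, a possible alternative is to skip any key that is already recorded on the run. The sketch below is only an illustration, not the fix recommended above. The helper names drop_already_logged and log_new_params_only are hypothetical. It also assumes the automatically logged parameters are already visible on the active run when it is queried, which has not been verified against Azure ML.

```python
from typing import Any, Dict, Mapping


def drop_already_logged(
    params: Mapping[str, Any], already_logged: Mapping[str, str]
) -> Dict[str, Any]:
    """Return only the params whose keys are not yet recorded on the run."""
    return {key: value for key, value in params.items() if key not in already_logged}


def log_new_params_only(params: Mapping[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: log params, skipping keys the run already has.

    Assumption: parameters logged automatically for the job are returned
    when the active run is fetched. If they are not, this does not help.
    """
    import mlflow  # imported lazily so the pure helper above works without mlflow

    run = mlflow.active_run()
    if run is None:
        raise RuntimeError("No active MLflow run; start one before logging.")

    # Re-fetch the run so that params recorded since it started are included.
    existing = mlflow.get_run(run.info.run_id).data.params
    new_params = drop_already_logged(params, existing)
    if new_params:
        mlflow.log_params(new_params)
    return new_params
```

In train.py, a call such as mlflow.log_params(vars(args)) would then become log_new_params_only(vars(args)), where args stands for whatever argument object the script already builds.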

setuc commented 1 year ago

I am assuming this is resolved for you, @jplummer01.