Closed andrewblance closed 1 year ago
@nicoleserafino Could you help us on this issue?
Have I potentially missed a step (or is the documentation currently missing a step?) when you have to declare the variable?
Nothing is declared in the data-explorer or key-vault modules, and secrets aren't discussed in either run-terraform-apply.yml or tf-ado-deploy-infra.yml. Also, there aren't any variables in the pipeline:
Am I missing something obvious maybe? How is this secret being passed to $(CLIENT_SECRET)?
Edited to add more information:
kvmonitoringspkey is created when monitoring is enabled, and is all taken care of within data-explorer/main.tf. Kinda relevant to this: why is App Insights also created in dev? And why, when it is created, is a Data Explorer not made alongside it?
@andrewblance Let me review this. You are right that this issue is related to the monitoring via the Data Explorer. If you don't need the monitoring or data drift, please disable the setup of the Azure Data Explorer. If you want to make it work, then you will need to set up _clientsecret. We haven't documented that step correctly either. Let us review the steps and document those, so we can show how to correctly pass the secret to the config file.
@andrewblance You can find the client secret being created in this template: https://github.com/Azure/mlops-templates/blob/main/templates/infra/create-sp-variables.yml
$(CLIENT_SECRET) becomes an environment variable in the ADO agent when you run the infra pipeline (see line 50 here: https://github.com/Azure/mlops-project-template/blob/main/infrastructure/terraform/pipelines/tf-ado-deploy-infra.yml)
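A quick way to tell whether the variable actually reached the agent: Azure DevOps leaves a macro such as $(CLIENT_SECRET) as literal text when the variable is not defined, which matches the secret value reported in this issue. Below is a minimal sketch of that check in Python. The helper names `is_unexpanded_macro` and `check_secret` are hypothetical and not part of the templates.

```python
import os
import re

# Azure DevOps leaves "$(NAME)" untouched when NAME is not defined,
# so a value of exactly that shape means the variable was never set.
_MACRO = re.compile(r"^\$\([A-Za-z_][A-Za-z0-9_.]*\)$")


def is_unexpanded_macro(value: str) -> bool:
    """Return True if value looks like a literal, unexpanded $(NAME) macro."""
    return bool(_MACRO.match(value.strip()))


def check_secret(name: str = "CLIENT_SECRET", env=None) -> str:
    """Classify the variable as 'missing', 'unexpanded' or 'ok'.

    env defaults to the process environment; pass a dict to check other values.
    """
    env = os.environ if env is None else env
    value = env.get(name, "")
    if not value:
        return "missing"
    if is_unexpanded_macro(value):
        return "unexpanded"
    return "ok"


if __name__ == "__main__":
    # Prints only the classification, never the secret itself.
    print(check_secret())
```

Running this in a script step of the infra pipeline, or applying `is_unexpanded_macro` to the value read back from key vault, should show whether the create-sp-variables template ran before the secret was written.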
However, I am not able to reproduce your problem :( When I run the pipeline and then look in key vault, I'm able to see a secret that's 56 characters long...
There is a separate issue that we are working to solve. It turns out when you use Terraform to inject a secret into key vault, it puts single-quotes around the secret value, and I thought that might be why your authentication is failing... We are going to migrate the key management to use AZ CLI instead because of this, but you'll have to bear with us while we push out the changes.
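Until that change lands, a possible workaround on the consuming side is to strip one pair of wrapping single quotes from the value after reading it from key vault. This is only a sketch: `strip_wrapping_quotes` is a hypothetical helper, and it assumes a real secret never legitimately starts and ends with a single quote.

```python
def strip_wrapping_quotes(secret: str) -> str:
    """Remove one pair of surrounding single quotes, if both are present."""
    if len(secret) >= 2 and secret[0] == "'" and secret[-1] == "'":
        return secret[1:-1]
    # Values without a matching pair of wrapping quotes are returned unchanged.
    return secret
```

Quotes inside the value are left alone, so only the wrapping pair described above is affected.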
Can you try running the sparse checkout again, with the aml-cli-v2 classical example and rerunning the terraform pipeline? We've made some changes recently to the IAC. Please check that "enable_monitoring" is set to true in the config-infra-prod.yml file, as we have it off by default in our main branch.
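To confirm the flag before rerunning the pipeline, something like the following could be used. It is a rough, standard-library-only sketch that scans the file text for the key rather than parsing the YAML properly, and it assumes enable_monitoring appears as a plain `key: value` line. The function name `monitoring_enabled` is hypothetical.

```python
import re
from pathlib import Path


def monitoring_enabled(config_text: str) -> bool:
    """Return True if the text sets enable_monitoring to true."""
    # Matches an uncommented "enable_monitoring: <value>" line.
    match = re.search(
        r"^\s*enable_monitoring\s*:\s*(\S+)", config_text, re.MULTILINE
    )
    if match is None:
        return False
    return match.group(1).strip("'\"").lower() == "true"


if __name__ == "__main__":
    # Assumes the script is run from the directory containing the config file.
    path = Path("config-infra-prod.yml")
    if path.exists():
        print(monitoring_enabled(path.read_text()))
```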
Finally, delete your unused ADX resources :) They've been known to cost a pretty penny...
@cindyweng Thank you so much! That was incredibly useful!
I think I found the cause: the version of tf-ado-deploy-infra.yml I had did not have the same line 50, and therefore was not calling the create-sp-variables template. When I originally cloned it, the file must have been a little different (looking at the git history suggests this may be the case, apologies). I added it in and reran sparse checkout. It works now!
I bumped into the single-quote issue you mentioned - but I created a new version of the key, and everything works now. Thank you for helping me here!
I note that some of the features between AZ CLI and the Python SDK are slightly different (there is no Python SDK pipeline to create an endpoint). I think I will try to make this myself (once I work through the drift monitoring pipelines), but I was curious if there was a reason the CLI version was created before the Python one - is the CLI Microsoft's suggested method of creating pipelines and interacting with AML in scenarios like this?
@andrewblance With regard to your question on why the CLI was done first, it came down to a choice about what to pursue first. We decided to focus on the CLI and get it ready so that a larger number of folks can use it, especially since it doesn't require you to learn Python. Once that is ready, we can work on the Python SDK to get similar examples in the repo.
We are also working out the logistics to allow for contributions on the various ML methods or ideas on the improvement for the workflow. We don't want the repo to get so complicated that it loses its appeal.
Hope that helps. Happy to have a broader conversation with you in case you are interested to contribute.
Hello,
I have been following your quick start guide, and have got to the stage where I need to deploy the pipeline "deploy-model-training-pipeline.yml" on Azure DevOps.
When I run this, it goes as far as the Run pipeline in AML step in DevOps, then I get this error:
I have done some investigating:
kvmonitoringspkey - looking at this though the Secret Value is just $(CLIENT_SECRET) - is it meant to be this? If so, why? And where was it set to this?
Do you have any advice on how I can fix this error?