Open JimMadge opened 2 months ago
As we already follow Pulumi's instructions for using Az CLI auth, this might be out of our control.
The solution may be to add documentation about this edge case. It would be worth deploying from other environments to see if this problem can occur other ways.
Original error message
This may be because of running the deployment from PowerShell, which sets environment variables differently. I did not encounter this error when running from bash on a Linux Azure VM.
That is interesting 🤔. We don't set environment variables, we pass them as arguments to the Pulumi automation API routine. It is possibly an upstream bug, though I think I would find that surprising. I'd want to confirm that it happens when running from PowerShell and not from some POSIX-like shell.
Yes, that'll be the next thing to try, as it's just conjecture rather than confirmed.
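One way to test the shell conjecture is to check what a child process actually sees when the variable is passed explicitly rather than inherited. The following is only a minimal sketch using the standard library, not the project's code or the Pulumi automation API; the helper name `child_sees` and the value `"true"` are assumptions made for illustration.

```python
import os
import subprocess
import sys
from typing import Mapping, Optional

VAR_NAME = "AZURE_KEYVAULT_AUTH_VIA_CLI"


def child_sees(name: str, env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the value of `name` as seen by a freshly spawned child process.

    `child_sees` is a hypothetical helper. If `env` is None the child inherits
    the parent's environment; otherwise it receives exactly the mapping given.
    """
    code = f"import os; print(os.environ.get({name!r}, ''))"
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=None if env is None else dict(env),
        capture_output=True,
        text=True,
        check=True,
    )
    value = result.stdout.strip()
    return value or None


if __name__ == "__main__":
    # Inherited: depends on how the calling shell exported the variable.
    print("inherited:", child_sees(VAR_NAME))
    # Explicit: independent of the calling shell ("true" is an assumed value).
    explicit_env = {**os.environ, VAR_NAME: "true"}
    print("explicit: ", child_sees(VAR_NAME, explicit_env))
```

If the explicit case prints the expected value in both PowerShell and bash, the shell is unlikely to be the cause.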
The important thing is not having the VM have a system assigned managed identity. If it has one, then it tries to use that to read secrets etc. So make sure that when creating the VM the option to give it a managed id is unticked.
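A quick way to check whether the VM you are deploying from exposes a managed identity is to ask the Azure Instance Metadata Service (IMDS) for a token. This is a rough sketch under stated assumptions, not something the project ships: the function names are hypothetical, and the interpretation of the status codes is my reading of the IMDS behaviour (200 when an identity can issue a token, an error status such as 400 when none is assigned).

```python
import urllib.error
import urllib.request
from typing import Optional

# IMDS token endpoint; only reachable from inside an Azure VM.
IMDS_TOKEN_URL = (
    "http://169.254.169.254/metadata/identity/oauth2/token"
    "?api-version=2018-02-01&resource=https%3A%2F%2Fmanagement.azure.com%2F"
)


def imds_status(timeout: float = 2.0) -> Optional[int]:
    """Return the HTTP status of an IMDS token request, or None if unreachable."""
    request = urllib.request.Request(IMDS_TOKEN_URL, headers={"Metadata": "true"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        # IMDS answered, but refused to issue a token.
        return exc.code
    except OSError:
        # Timeout or connection failure: most likely not on an Azure VM.
        return None


def describe_identity(status: Optional[int]) -> str:
    """Turn the status code into a human-readable hint (hypothetical helper)."""
    if status is None:
        return "IMDS unreachable: probably not running on an Azure VM"
    if status == 200:
        return "managed identity available: it may be picked up instead of the CLI login"
    return f"IMDS reachable but no token issued (HTTP {status}): likely no managed identity"


if __name__ == "__main__":
    print(describe_identity(imds_status()))
```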
OK, that's a simple-enough thing to add to the docs with a big warning box, right?
Yes - but I think it might be worth adding a whole section for deploying from an Azure VM, as this only gets you through the door. You'll still need to change the storage account routing from Microsoft to Internet. Maybe we need a "common problems when deploying from an Azure VM" section or similar.
You'll still need to change the storage account routing from Microsoft to Internet.
Do we understand the implications of doing that? For example, will it degrade storage performance within SREs?
It sounds like there are more problems here than the managed identity?
My feeling is still that we should advise against deploying in this way. It looks like we would have to make changes to the infrastructure we deploy to support it. I don't like the idea of having the TRE vary depending on what host you have deployed from. And in this case, presumably if I wanted to update an SRE deployed from an Azure VM from my local machine, it would end up making changes to (and possibly recreating) resources.
Also, if we do go through the effort to support it now, we will need to continue that in the future, which adds burden to our testing/validation.
I think there is a risk that we are missing the real issue here too. What is it about the way we are currently distributing the code that makes deploying a VM on Azure, and installing it there, the best way to use it?
I agree with @JimMadge. Is deploying a Linux VM, with managed identity turned off, sufficient that you can then follow the normal deployment instructions?
:no_entry_sign: Describe the problem
Related to #2184
Digging into that issue, with more verbose logging we found, When deploying from an Azure Virtual Machine,
Quite possibly Azure Environment Authentication is being used. This is in spite of `AZURE_KEYVAULT_AUTH_VIA_CLI` being set:
https://github.com/alan-turing-institute/data-safe-haven/blob/cdd76a3b51a8278ab1562f90ab8637ad8ab5276b/data_safe_haven/external/interface/pulumi_account.py#L38-L42
Perhaps this is always being ignored on an Azure VM.
:recycle: To reproduce