microsoft / service-fabric

Service Fabric is a distributed systems platform for packaging, deploying, and managing stateless and stateful distributed applications and containers at large scale.
https://docs.microsoft.com/en-us/azure/service-fabric/
MIT License

DefaultAzureCredential never works with AzureCLI when Developing Locally #1418

Open JackWitherell opened 1 year ago

JackWitherell commented 1 year ago

Bug: Unable to use DefaultAzureCredential()

Describe the issue:

When using DefaultAzureCredential() in Azure, everything works great: authentication is picked up and UserAssignedRoles (MSI) work perfectly. When developing locally, however, there's a problem.

DefaultAzureCredential uses Azure CLI or PowerShell auth (or various other methods, none of which are working for me; I'm trying to fix those separately) to pick up credentials. I can't use PowerShell auth. When new DefaultAzureCredential() is run, I get the following error.

Azure CLI authentication failed due to an unknown error. See the troubleshooting guide for more information. https://aka.ms/azsdk/net/identity/azclicredential/troubleshoot
Traceback (most recent call last):
  File "runpy.py", line 196, in _run_module_as_main
  File "runpy.py", line 86, in _run_code
  File "D:\a\_work\1\s\build_scripts\windows\artifacts\cli\Lib\site-packages\azure/cli/__main__.py", line 39, in <module>
  File "D:\a\_work\1\s\build_scripts\windows\artifacts\cli\Lib\site-packages\azure/cli/core/__init__.py", line 895, in get_default_cli
  File "D:\a\_work\1\s\build_scripts\windows\artifacts\cli\Lib\site-packages\azure/cli/core/azlogging.py", line 30, in <module>
  File "D:\a\_work\1\s\build_scripts\windows\artifacts\cli\Lib\site-packages\azure/cli/core/commands/__init__.py", line 25, in <module>
  File "D:\a\_work\1\s\build_scripts\windows\artifacts\cli\Lib\site-packages\azure/cli/core/extension/__init__.py", line 18, in <module>
  File "D:\a\_work\1\s\build_scripts\windows\artifacts\cli\Lib\site-packages\knack/config.py", line 40, in __init__
  File "D:\a\_work\1\s\build_scripts\windows\artifacts\cli\Lib\site-packages\knack/util.py", line 115, in ensure_dir
  File "D:\a\_work\1\s\build_scripts\windows\artifacts\cli\Lib\site-packages\knack/util.py", line 112, in ensure_dir
  File "os.py", line 225, in makedirs
PermissionError: [WinError 5] Access is denied: 'C:\\Windows\\system32\\config\\systemprofile\\.azure'

This isn't where my credentials are stored. They're stored in my user directory (C:\Users\jwitherell\.azure)
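To make the mismatch concrete, here is a minimal sketch of how a CLI could derive its config folder from the home directory of whatever account runs it. This is an illustration under stated assumptions, not the actual Azure CLI or knack code: `resolve_config_dir` is a hypothetical helper, and it assumes the `AZURE_CONFIG_DIR` environment variable takes precedence over the home-directory default. Whether setting that variable for the Service Fabric host process fixes this issue is untested here.

```python
import ntpath  # Windows path rules, so the sketch behaves the same on any OS


def resolve_config_dir(env, home):
    """Hypothetical helper: choose the folder where CLI credentials are looked up.

    Assumption: an AZURE_CONFIG_DIR override wins; otherwise the folder is
    '<home>\\.azure', where <home> belongs to the account running the process.
    """
    override = env.get("AZURE_CONFIG_DIR")
    if override:
        return override
    return ntpath.join(home, ".azure")


# The interactive user's home vs. the profile path seen in the traceback.
user_home = r"C:\Users\jwitherell"
system_home = r"C:\Windows\system32\config\systemprofile"

user_dir = resolve_config_dir({}, user_home)
system_dir = resolve_config_dir({}, system_home)
overridden = resolve_config_dir({"AZURE_CONFIG_DIR": user_dir}, system_home)

print(user_dir)    # where `az login` stored the credentials
print(system_dir)  # where the failing process tried to create its folder
print(overridden)  # same process, but with the override applied
```

The path in the traceback suggests the CLI was launched by a process whose home directory is the system profile, which would explain why it never looks at the credentials stored under the user directory.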

According to this issue from 2019, "Service Fabric is picking the wrong location to look for credentials". While trying to confirm that, I found this issue, which states that when Visual Studio is running in Administrator mode, the Service Fabric process may not be able to look in the right place for credentials. In other words, it may not be Service Fabric's fault; it may simply be that I'm running as Administrator.

However, when I try to run my Service Fabric Application in Visual Studio...

image

Oh no.

Based on the feedback and responses in the links above, I'm inclined to believe that this issue lies with Service Fabric to fix. Azure CLI states that they don't point credentials to System32, and Azure SDK for .NET states that Service Fabric can't point to the right location when telling Azure CLI where to look for credentials. I found this rather frank developer discussion that shows that Service Fabric has a firm stance on requiring Administrator VS when testing code locally. Removing this requirement would be very beneficial, and I'm sure it's been brought up a lot.

This has a huge impact on my ability to develop.

Describe as best as possible how to reproduce the issue:

  1. Buy a new computer and set it up (or use Windows Sandbox, invaluable for testing out of the box issues)
  2. Install Visual Studio, Azure CLI and the Service Fabric SDK
  3. Ensure Azure CLI can see credentials based on their troubleshooting steps (it can, for me!)
  4. In Visual Studio, Create a new Service Fabric Application, and a Stateless Service
  5. Use new DefaultAzureCredential(), possibly block out a few of the more stable Auth options using something like new DefaultAzureCredential(new DefaultAzureCredentialOptions { ExcludeAzurePowerShellCredential = true })
  6. Run locally. You may need to catch the exception to see the error, as in the result shown below.

Result: image

Admittedly, I'm a little bit at the end of my rope. I couldn't find enough information online to verify whether I'm missing something, but if this is or was a known issue and there's a workaround, please let me know. Do I need a new version of something? I've tried updating everything I can, and I've also tried several ways to develop SF locally without needing Administrator. I'd love to hear that there's a way to override how Azure CLI is being used and point it to the correct directory.

williamoconnorme commented 1 year ago

I've a workaround for this which uses the Az CLI token.

https://gist.github.com/williamoconnorme/ffd88e2206abedee4a436cee7c416d39

This script uses PsExec to log in with the Az CLI as the NETWORK SERVICE account. Service Fabric runs under this account by default, so your code running on the cluster will be able to access the token. I'm not sure how often the token may need to be refreshed, but it might work for your scenario.