Azure / azure-dev

A developer CLI that reduces the time it takes for you to get started on Azure. The Azure Developer CLI (azd) provides a set of developer-friendly commands that map to key stages in your workflow - code, build, deploy, monitor, repeat.
https://aka.ms/azd
MIT License

[Regression] multi-tenancy support - InvalidAuthenticationTokenTenant Error #3485

Closed WhitWaldo closed 2 months ago

WhitWaldo commented 3 months ago

Output from azd version

azd version 1.6.1 (commit eba2c978b5443fdb002c95add4011d9e63c2e76f)

Describe the bug

I have a very simple Aspire app whose deployment failed. As the code shows, it adds Cosmos DB to my application and that's it.

var builder = DistributedApplication.CreateBuilder(args);

var cosmos = builder.AddAzureCosmosDB("cosmos");

builder.AddProject<Projects.Utilities>("utilities")
    .WithReference(cosmos);

builder.Build().Run();

I ran the following commands:

azd login
azd init
azd up

This successfully completed the step that reads "SUCCESS: Your application was provisioned in Azure in 2 minutes 6 seconds." and I see that it deployed a Container Registry, a Container Apps environment, Cosmos DB, Log Analytics, and a Managed Identity.

It failed upon attempting to deploy the services:

ERROR: failed deploying service 'utilities': failing invoking action 'deploy', failed executing template file: template: containerApp.tmpl.yaml:21:19: executing "containerApp.tmpl.yaml" at <connectionString "cosmos">: error calling connectionString: POST https://management.azure.com/subscriptions/5cec4120-f436-4a94-8a6e-************/resourceGroups/rg-myapp-web/providers/Microsoft.DocumentDB/databaseAccounts/cosmosholua67mc6kj2/listConnectionStrings
--------------------------------------------------------------------------------
RESPONSE 401: 401 Unauthorized
ERROR CODE: InvalidAuthenticationTokenTenant
--------------------------------------------------------------------------------
{
  "error": {
    "code": "InvalidAuthenticationTokenTenant",
    "message": "The access token is from the wrong issuer 'https://sts.windows.net/f8cdef31-a31e-4b4a-93e4-************/'. It must match the tenant 'https://sts.windows.net/efbbaae1-f282-4ff1-8f02-************/' associated with this subscription. Please use the authority (URL) 'https://login.windows.net/efbbaae1-f282-4ff1-8f02-************' to get the token. Note, if the subscription is transferred to another tenant there is no impact to the services, but information about new tenant could take time to propagate (up to an hour). If you just transferred your subscription and see this error message, please try back later."
  }
}
--------------------------------------------------------------------------------

I haven't transferred anything on this tenant in months (years?), and this is my first attempt at deploying anything to it. What's also odd is that this message appears only after all the other resources have been created successfully. It's only when azd attempts to get the connection strings from one of those deployed resources that it all comes apart.
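For anyone debugging the same 401, here is a minimal diagnostic sketch in Python for checking which tenant issued a token. It decodes the JWT payload without verifying the signature and compares the `iss` claim against the `https://sts.windows.net/<tenant>/` form shown in the error. The function names, the demo token builder, and the placeholder tenant IDs are all assumptions made for illustration; none of this is azd code.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload = token.split(".")[1]
    # JWT segments are base64url without padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def issuer_matches(token: str, expected_tenant_id: str) -> bool:
    """True when the token's issuer is the sts.windows.net URL of the expected tenant."""
    expected = f"https://sts.windows.net/{expected_tenant_id}/"
    return jwt_claims(token).get("iss") == expected


def make_demo_token(tenant_id: str) -> str:
    """Build a fake, unsigned token for illustration only (hypothetical helper)."""
    def seg(obj: dict) -> str:
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

    claims = {"iss": f"https://sts.windows.net/{tenant_id}/", "tid": tenant_id}
    return ".".join([seg({"alg": "none"}), seg(claims), ""])


if __name__ == "__main__":
    # Placeholder IDs, not the masked tenant IDs from the error above.
    login_tenant = "00000000-0000-0000-0000-00000000000a"
    subscription_tenant = "00000000-0000-0000-0000-00000000000b"
    token = make_demo_token(login_tenant)
    print(issuer_matches(token, subscription_tenant))
```

A mismatch here corresponds to the InvalidAuthenticationTokenTenant case: the token is valid, but it was issued by a tenant other than the one that owns the subscription.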

It appears to be something wrong with this tenant, but it's unclear what. I tried deploying to another (much newer) tenant just for fun and it deployed without issue following an azd logout and an azd login. No idea what the difference is between the two though.

To Reproduce

I can reproduce this every time I attempt to deploy to any subscription on the older tenant. Let me know what you want done and I can pull logs for you. As I don't know what differs between that tenant and my newer one, I can't say precisely how to reproduce it elsewhere.

Expected behavior

The end-to-end deployment should have completed without issue.

Additional context

Originally filed at https://github.com/dotnet/aspire/issues/2668 but refiled here after being asked to.

rajeshkamal5050 commented 3 months ago

Good to know that it works fine on your newer tenant.

Can you try using the templates option on your failing older tenant? I think it should fail for non-Aspire scenarios too.

azd init

Initializing an app to run on Azure (azd init)

How do you want to initialize your app? Select a template
Select a project template: todo-csharp [Use arrows to move, type to filter]
  React Web App with C# API and MongoDB (Azure-Samples/todo-csharp-cosmos-sql)
  React Web App with C# API and SQL Database (Azure-Samples/todo-csharp-sql)
  Static React Web App + Functions with C# API and SQL Database (Azure-Samples/todo-csharp-sql-swa-func)

@vhvb1989 @weikanglim this seems related to multi-tenancy support. Is there anything else we need from @WhitWaldo to triage?

WhitWaldo commented 3 months ago

I tried deploying the "todo-csharp-sql" template and it worked without issue.

I tried again with something that specifically targeted Cosmos ("Blazor with Cosmos, Open AI" or something) and it failed for lack of sufficient OpenAI quota (my guess is the resource isn't actually provisioned as again, it's an old tenant/subscription).

Tried once more with a React project (only other one that mentioned Cosmos) - deployment went fine, so couldn't repro there either.

Still couldn't deploy my own Aspire-based project. Same error.

weikanglim commented 3 months ago

@rajeshkamal5050 No, we have what we need -- it's pretty clearly broken.

  1. We removed the tenant-specific TokenCredential from the DI graph a while back: #1763
  2. We re-added TokenCredential to the DI graph, this time using the default tenant: #2765
  3. We recently started using TokenCredential extensively, and this issue has expanded: https://github.com/search?q=repo%3AAzure%2Fazure-dev%20azcore.TokenCredential&type=code
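
The three steps above amount to one credential, bound to the default tenant, being shared by calls that may target a subscription owned by a different tenant. As a rough illustration only, here is a Python sketch of the alternative: looking up the tenant that owns the subscription and handing out a credential bound to that tenant. `Credential`, `CredentialProvider`, and the subscription and tenant names are hypothetical stand-ins and do not reflect azd's actual Go implementation.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Credential:
    """Stand-in for a token credential bound to a single tenant (hypothetical)."""
    tenant_id: str

    def get_token(self) -> str:
        # A real credential would call the identity service; this only
        # records which tenant would have issued the token.
        return f"token-issued-by:{self.tenant_id}"


@dataclass
class CredentialProvider:
    """Resolves a credential from the tenant that owns the target subscription."""
    subscription_tenants: dict  # subscription id -> owning tenant id
    _cache: dict = field(default_factory=dict)

    def for_subscription(self, subscription_id: str) -> Credential:
        tenant_id = self.subscription_tenants[subscription_id]
        # Reuse one credential per tenant rather than one per call.
        if tenant_id not in self._cache:
            self._cache[tenant_id] = Credential(tenant_id)
        return self._cache[tenant_id]


if __name__ == "__main__":
    # The login (default) tenant differs from the tenant owning the subscription.
    default_credential = Credential("default-tenant")
    provider = CredentialProvider({"sub-old": "old-tenant"})

    print(default_credential.get_token())                    # default-tenant token
    print(provider.for_subscription("sub-old").get_token())  # owning-tenant token
```

The design point is that the tenant is derived from the subscription being targeted at call time, not fixed once when the credential is registered.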

rajeshkamal5050 commented 3 months ago

@weikanglim can you pick up PR #3528, or create a new one?