microsoft / botbuilder-js

Welcome to the Bot Framework SDK for JavaScript repository, which is the home for the libraries and packages that enable developers to build sophisticated bot applications using JavaScript.
https://github.com/Microsoft/botframework
MIT License

UserAssignedIdentity(WorkloadIdentity) auth fails with 'scope https://api.botframework.com is not valid' #4582

Closed patst closed 9 months ago

patst commented 10 months ago

Hi! 👋

Firstly, thanks for your work on this project! 🙂

Today I used patch-package to patch botframework-connector@4.21.3 for the project I'm working on.

I am using botbuilder with the MS Teams connector. My configuration uses a UserAssignedMSI, and the BotFrameworkAuthentication is configured like this:

const botFrameworkAuthentication = new ConfigurationBotFrameworkAuthentication(
    {},
    new ConfigurationServiceClientCredentialFactory({
      MicrosoftAppId: config.botId,
      MicrosoftAppType: "UserAssignedMSI", // Bot test framework can only handle "MultiTenant",
      MicrosoftAppTenantId: config.botTenantId,
    })
);

I use it in conjunction with Azure Workload Identity and my bot is running inside a Pod deployed in AKS.

I get an error while fetching the token, at the point where a response should be returned to the bot service:

azure:identity:warning IdentityClient: authentication error. HTTP status: 400, AADSTS70011: The provided request must include a 'scope' input parameter. The provided value for the input parameter 'scope' is not valid. The scope https://api.botframework.com is not valid. Trace ID: f136be63-1f12-4d80-94be-804dabe01600 Correlation ID: bcc381f3-2e02-455d-a46c-fd3705f24eb4 Timestamp: 2023-12-11 13:59:55Z

The correct scope would be https://api.botframework.com/.default

I spent some time debugging and found a difference between the UserAssignedMSI and the SingleTenant code branches. Single Tenant version: https://github.com/microsoft/botbuilder-js/blob/f3db3e98bb139c7aecc921483ea188574de7aada/libraries/botbuilder-core/src/configurationServiceClientCredentialFactory.ts#L97-L130

If you drill further down, it is clear that the audience is taken as input and then used as the scope in the OAuth flows. This only works if /.default is appended to the scope. For the single-tenant version this is done in the MsalAppCredentials class: https://github.com/microsoft/botbuilder-js/blob/f3db3e98bb139c7aecc921483ea188574de7aada/libraries/botframework-connector/src/auth/msalAppCredentials.ts#L108-L112

In the UserAssignedMSI version the scope is taken without any further modification: https://github.com/microsoft/botbuilder-js/blob/f3db3e98bb139c7aecc921483ea188574de7aada/libraries/botframework-connector/src/auth/managedIdentityAppCredentials.ts#L37

To apply the same logic as in the SingleTenant version, here is the diff that solved my problem:

diff --git a/node_modules/botframework-connector/src/auth/managedIdentityAppCredentials.ts b/node_modules/botframework-connector/src/auth/managedIdentityAppCredentials.ts
index cc19c1b..fb4f414 100644
--- a/node_modules/botframework-connector/src/auth/managedIdentityAppCredentials.ts
+++ b/node_modules/botframework-connector/src/auth/managedIdentityAppCredentials.ts
@@ -34,7 +34,13 @@ export class ManagedIdentityAppCredentials extends AppCredentials {

         this.tokenProviderFactory = tokenProviderFactory;
         super.appId = appId;
-        this.authenticator = new ManagedIdentityAuthenticator(this.appId, this.oAuthScope, this.tokenProviderFactory);
+
+        const scopePostfix = '/.default';
+        let scope = this.oAuthScope;
+        if (!scope.endsWith(scopePostfix)) {
+            scope = `${scope}${scopePostfix}`;
+        }
+        this.authenticator = new ManagedIdentityAuthenticator(this.appId, scope, this.tokenProviderFactory);
     }

     /**

This issue body was partially generated by patch-package.
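
For illustration, the normalization applied in the diff above can be isolated as a small standalone function. This is only a sketch: `ensureDefaultScope` and `SCOPE_POSTFIX` are hypothetical names and are not part of botframework-connector.

```typescript
// Hypothetical helper (not part of botframework-connector): isolates the
// scope normalization that the patch performs inside the constructor.
const SCOPE_POSTFIX = '/.default';

function ensureDefaultScope(scope: string): string {
    // Append the postfix only when it is missing, so an already
    // normalized scope is returned unchanged.
    return scope.endsWith(SCOPE_POSTFIX) ? scope : `${scope}${SCOPE_POSTFIX}`;
}

console.log(ensureDefaultScope('https://api.botframework.com'));
console.log(ensureDefaultScope('https://api.botframework.com/.default'));
```

Both calls yield the same value, so applying the helper more than once is harmless.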

What do you think? If you agree, I can prepare a pull request for the change.

ceciliaavila commented 9 months ago

Hi @patst, we couldn't reproduce the error using a UserAssignedMSI bot deployed in an Azure App Service. We are working on deploying the bot to an AKS cluster, and it would be helpful if you could provide the steps you followed to deploy your bot and configure the Azure Workload Identity. Thanks!

patst commented 9 months ago

@ceciliaavila thanks for your message.

I created a small example app to reproduce the error. See the repository at https://github.com/patst/botbuilder-js-4582

I added some Kubernetes manifests in the manifests folder. The configuration in the Azure Portal follows the docs provided on the AKS pages (https://learn.microsoft.com/en-us/azure/aks/workload-identity-overview?tabs=dotnet#microsoft-authentication-library-msal).

Hope that helps.

I think the main difference is that the managed identity credentials (used in App Service) call the IMDS endpoint at 169.254.169.254, which somehow accepts the scope (or uses another one?), whereas the workload identity credentials use https://login.microsoftonline.com, which rejects the invalid scope.
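
One possible explanation, stated here as an assumption rather than something confirmed in this thread: managed identity endpoints such as IMDS take a v1-style `resource` parameter, and a client library may strip a trailing /.default before calling them, while the v2.0 token endpoint at login.microsoftonline.com takes a `scope` parameter that must end in /.default for client-credential style flows. The sketch below only models that difference; `toResource`, `toV2Scope`, and `DEFAULT_SUFFIX` are hypothetical names, not library APIs.

```typescript
// Sketch under stated assumptions; none of these names exist in
// botframework-connector or @azure/identity.
const DEFAULT_SUFFIX = '/.default';

// Assumed v1-style behavior: the endpoint wants a bare resource URI,
// so a trailing "/.default" is removed if present.
function toResource(scopeOrResource: string): string {
    return scopeOrResource.endsWith(DEFAULT_SUFFIX)
        ? scopeOrResource.slice(0, -DEFAULT_SUFFIX.length)
        : scopeOrResource;
}

// Assumed v2.0 behavior: the endpoint wants "<resource>/.default",
// so the suffix is added if missing.
function toV2Scope(scopeOrResource: string): string {
    return `${toResource(scopeOrResource)}${DEFAULT_SUFFIX}`;
}

const audience = 'https://api.botframework.com';
console.log({ resource: toResource(audience) }); // unchanged for a v1-style call
console.log({ scope: toV2Scope(audience) });     // suffix added for a v2.0 call
```

Under this assumption, the unmodified audience happens to be a valid `resource` value but not a valid v2.0 `scope`, which would match the behavior seen in App Service versus AKS with Workload Identity.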

ceciliaavila commented 9 months ago

Hi @patst, thanks for the information. We managed to deploy the application in the cluster and enable workload identity, but we are struggling to create the ingress and the service to access the bot. Do you have the steps or the manifests for this? We are following these two guides, but we are not sure if we are missing something: https://learn.microsoft.com/en-us/azure/aks/ingress-basic?tabs=azure-cli and https://learn.microsoft.com/en-us/azure/aks/ingress-tls?tabs=azure-cli#create-an-ingress-controller. Thanks!

patst commented 9 months ago

Hey @ceciliaavila, thanks for working on it. I added an ingress and a service definition to the example repository.

In addition to that, you will need a valid TLS certificate for the ingress. You could use cert-manager for that. What problems are you facing exactly? Maybe the AKS team can give you a hand with getting the cluster up and running.

ceciliaavila commented 9 months ago

Hi @patst, thanks for all your help, we were finally able to reproduce the error. We'll be reviewing the fix you proposed. Thanks!