Azure / azure-devops-cli-extension

Azure DevOps Extension for Azure CLI
https://docs.microsoft.com/en-us/cli/azure/ext/azure-devops/?view=azure-cli-latest
MIT License
617 stars 239 forks source link

[Feature Request] Rewrite AAD token authentication #1298

Open jikuja opened 1 year ago

jikuja commented 1 year ago

Is your feature request related to a problem? Please describe. AAD token authentication is really flaky, hides error messages and has terrible UX

Splitted from issue #1258

Describe the solution you'd like Rewrite most of the related code

Background Story

Current status

  1. Extension uses subscription information from AZ CLI(1) subscriptions = profile.load_cached_subscriptions(False) loops through the information
  2. tenant of the active subscription is added as first item if tenantsDict (2)
  3. other tenant information from the subscription list is being collected into tenantsDict
  4. the code goes through collected tenants from tenantsDict
    • token for tenant is being fetch from AZ CLI: token = get_token_from_az_login(profile, key[0])
      1. problems on this step described later
    • returned token is validated with ADO endpoint: validate_token_for_instance(organization, credentials)
      1. if validation passed then correct token/tenant was found and and the token can be used
      2. if validation failed than token us not usable
      3. if suitable token not found then empty token is returned from get_token_from_az_logins() and _get_credentials() will raise exception
      4. issues with validate_token_for_instance() is described later

get_token_from_az_login() problems

https://github.com/Azure/azure-devops-cli-extension/blob/b3d0392d597a2eae5229e96059359d00fbb2e222/azure-devops/azext_devops/dev/common/services.py#L155-L165

validate_token_for_instance() problems

https://github.com/Azure/azure-devops-cli-extension/blob/b3d0392d597a2eae5229e96059359d00fbb2e222/azure-devops/azext_devops/dev/common/services.py#L86-L98

get_token_from_az_logins() problem

https://github.com/Azure/azure-devops-cli-extension/blob/b3d0392d597a2eae5229e96059359d00fbb2e222/azure-devops/azext_devops/dev/common/services.py#L116-L152

Issues

Proposed fixes

Add --tenant parameter

If user can applies tenant id then https://github.com/Azure/azure-devops-cli-extension/blob/b3d0392d597a2eae5229e96059359d00fbb2e222/azure-devops/azext_devops/dev/common/services.py#L123-L128 logic of that code could be simplified and looping all available tenants is not needed

Request tenant id from Azure Devops service

ADO API returns tenant id with 403 replies

The following headers are returned from ADO:

Looping through all tenant information is not needed if tenant information is fetch from the ADO API.

Removal of token validation

Removal of the tenant loops


(1) https://github.com/Azure/azure-devops-cli-extension/blob/b3d0392d597a2eae5229e96059359d00fbb2e222/azure-devops/azext_devops/dev/common/services.py#L119 (2) this code was broken. Fix is in main but not released yet


Current state:

sequenceDiagram
    autonumber
    participant User
    participant Extension
    participant ADO_API
    participant CLI
    User->>Extension: run command
    Extension->>CLI: get subscriptions
    CLI-->>Extension: 
    %%Extension->>Extension: 
    %%activate Extension
    loop Validation: subscription/user pairs to find working token
        Extension->>CLI: Get token for sub/user
        CLI-->Extension: 
        Extension->>ADO_API: Try fetching data with token
        alt failure
        ADO_API-->>Extension: fail
        Note over ADO_API,Extension: continue loop
        else success 
        ADO_API-->>Extension: success
        Note over ADO_API,Extension: exit loop
        end
        opt After loop exhaustion
        Extension-->>User: Display AAD token error
        end
        opt After successfull token validation
        Extension->>ADO_API: Fetch data
        ADO_API-->>Extension: 
        Extension-->>User: Display for ADO API request
        end
    end
    Note right of ADO_API: slow and unneeded loop

Future?:

sequenceDiagram
    autonumber
    participant User
    participant Extension
    participant ADO_API
    participant CLI
    User->>Extension: run command
    Extension->>ADO_API: get data from protected enpoint
    ADO_API-->>Extension: 
    Extension->>CLI: get subscriptions
    CLI-->>Extension: 
    Extension->Extension: find correct subscription/username pair for given tenant id
    Extension->>CLI: Get token for sub/user
    CLI-->Extension: 
    Extension->Extension: handle profile.get_raw_token() errors

    opt After getting token for given tenant
    Note over ADO_API,Extension: validation not really needed
    Extension->>ADO_API: Fetch data
    ADO_API-->>Extension: 
    Extension-->>User: Display for ADO API request
    end

    opt Token not found or other profile.get_raw_token() exception
    Extension-->>User: Display AAD token error
    end
jikuja commented 1 year ago

And even more things that could be classified as bugs:

login process does full Azure AD interragotion even when user has:

If user has large number of tenants in their az cli login credentials cache this will add huge delay before the actual API call is made.

In my use case those CLI has to make extra 8 queries to https://login.microsoftonline.com/ to fetch a token and other 8 extra calls to validate token by calling https://dev.azure.com:443/{organization}/_apis/projects?stateFilter=all&$top=1&$skip=0 even I want to use PAT.

Does this logic make sense? Is there any documentation how current authentication code is supposed to work?