Azure / cli

Automate your GitHub workflows using Azure CLI scripts

Breaking change latest CLI version #56

Closed rvdlaarschot closed 2 years ago

rvdlaarschot commented 2 years ago

When running the Azure/cli@1.0.4 action with the latest CLI version (2.30.0 since today) in a workflow I get the following error despite having run the Azure/login@v1 action in a previous step in the same job:

ERROR: Could not retrieve credential from local cache for service principal *****. Run 'az login' for this service principal.

Reverting to the previous CLI version by passing the argument azcliversion: 2.29.2 to the Azure/cli@1.0.4 action solved the issue for me.
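
A minimal sketch of that workaround, assuming a typical login-then-CLI job. The step name, the secret name AZURE_CREDENTIALS, and the az account show command are placeholders, not taken from the original workflow. It only helps while the agent itself still ships a CLI older than 2.30.0.

```yaml
steps:
  - uses: Azure/login@v1
    with:
      creds: ${{ secrets.AZURE_CREDENTIALS }}  # assumed secret name

  - name: Run Azure CLI script  # hypothetical step name
    uses: Azure/cli@1.0.4
    with:
      azcliversion: 2.29.2  # pin to the release before 2.30.0
      inlineScript: |
        az account show
```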

fhavrlent commented 2 years ago

Getting the same error with azure/cli@v1 and without passing azcliversion.

marvinbuss commented 2 years ago

I can see the same error when using the latest Azure CLI version.

t-dedah commented 2 years ago

Hi 👋 There is already an issue open for this error with the Azure CLI team. We will also investigate whether anything can be done on our side.

https://github.com/Azure/azure-cli/issues/20153

t-dedah commented 2 years ago

Hi @rvdlaarschot @marvinbuss @fhavrlent The issue is caused by a mismatch between the az cli version on the agent and the az cli version used by azure/cli. This will resolve itself once all the agents have the latest az cli version; until then, we will have to rely on explicitly setting version 2.29.2 in azure/cli.

Root cause

This is because Azure Login still uses the old ADAL-based Azure CLI 2.29.0 while Azure CLI Action uses the latest MSAL-based Azure CLI 2.30.0.

After the ADAL->MSAL migration (https://github.com/Azure/azure-cli/pull/19853), the latest Azure CLI is not compatible with old versions.

For more info: https://github.com/Azure/azure-cli/issues/20154#issuecomment-958615636

ajmarks commented 2 years ago

I'm guessing this issue with az login --identity is related:

2021-11-03T23:46:52.035327497Z ERROR: The command failed with an unexpected error. Here is the traceback:
2021-11-03T23:46:52.035837204Z ERROR: invalid literal for int() with base 10: '11/04/2021 23:46:50 +00:00'
2021-11-03T23:46:52.035860304Z Traceback (most recent call last):
2021-11-03T23:46:52.035866204Z   File "/opt/az/lib/python3.6/site-packages/knack/cli.py", line 231, in invoke
2021-11-03T23:46:52.035870904Z     cmd_result = self.invocation.execute(args)
2021-11-03T23:46:52.035875104Z   File "/opt/az/lib/python3.6/site-packages/azure/cli/core/commands/__init__.py", line 657, in execute
2021-11-03T23:46:52.035879604Z     raise ex
2021-11-03T23:46:52.035883904Z   File "/opt/az/lib/python3.6/site-packages/azure/cli/core/commands/__init__.py", line 720, in _run_jobs_serially
2021-11-03T23:46:52.035888304Z     results.append(self._run_job(expanded_arg, cmd_copy))
2021-11-03T23:46:52.035892404Z   File "/opt/az/lib/python3.6/site-packages/azure/cli/core/commands/__init__.py", line 691, in _run_job
2021-11-03T23:46:52.035896704Z     result = cmd_copy(params)
2021-11-03T23:46:52.035900804Z   File "/opt/az/lib/python3.6/site-packages/azure/cli/core/commands/__init__.py", line 328, in __call__
2021-11-03T23:46:52.035905105Z     return self.handler(*args, **kwargs)
2021-11-03T23:46:52.035909305Z   File "/opt/az/lib/python3.6/site-packages/azure/cli/core/commands/command_operation.py", line 121, in handler
2021-11-03T23:46:52.035913605Z     return op(**command_args)
2021-11-03T23:46:52.035917605Z   File "/opt/az/lib/python3.6/site-packages/azure/cli/command_modules/profile/custom.py", line 128, in login
2021-11-03T23:46:52.035921905Z     return profile.login_with_managed_identity(username, allow_no_subscriptions)
2021-11-03T23:46:52.035926205Z   File "/opt/az/lib/python3.6/site-packages/azure/cli/core/_profile.py", line 244, in login_with_managed_identity
2021-11-03T23:46:52.035930505Z     subscriptions = subscription_finder.find_using_specific_tenant(tenant, msi_creds)
2021-11-03T23:46:52.035934705Z   File "/opt/az/lib/python3.6/site-packages/azure/cli/core/_profile.py", line 822, in find_using_specific_tenant
2021-11-03T23:46:52.035939005Z     for s in subscriptions:
2021-11-03T23:46:52.035943105Z   File "/opt/az/lib/python3.6/site-packages/azure/core/paging.py", line 129, in __next__
2021-11-03T23:46:52.035947405Z     return next(self._page_iterator)
2021-11-03T23:46:52.035951405Z   File "/opt/az/lib/python3.6/site-packages/azure/core/paging.py", line 76, in __next__
2021-11-03T23:46:52.035970405Z     self._response = self._get_next(self.continuation_token)
2021-11-03T23:46:52.035974705Z   File "/opt/az/lib/python3.6/site-packages/azure/mgmt/resource/subscriptions/v2019_11_01/operations/_subscriptions_operations.py", line 224, in get_next
2021-11-03T23:46:52.035979206Z     pipeline_response = self._client._pipeline.run(request, stream=False, **kwargs)
2021-11-03T23:46:52.035984206Z   File "/opt/az/lib/python3.6/site-packages/azure/core/pipeline/_base.py", line 211, in run
2021-11-03T23:46:52.035988506Z     return first_node.send(pipeline_request)  # type: ignore
2021-11-03T23:46:52.035992706Z   File "/opt/az/lib/python3.6/site-packages/azure/core/pipeline/_base.py", line 71, in send
2021-11-03T23:46:52.035996906Z     response = self.next.send(request)
2021-11-03T23:46:52.036000906Z   File "/opt/az/lib/python3.6/site-packages/azure/core/pipeline/_base.py", line 71, in send
2021-11-03T23:46:52.036005106Z     response = self.next.send(request)
2021-11-03T23:46:52.036009106Z   File "/opt/az/lib/python3.6/site-packages/azure/core/pipeline/_base.py", line 71, in send
2021-11-03T23:46:52.036013306Z     response = self.next.send(request)
2021-11-03T23:46:52.036017406Z   [Previous line repeated 2 more times]
2021-11-03T23:46:52.036021506Z   File "/opt/az/lib/python3.6/site-packages/azure/mgmt/core/policies/_base.py", line 47, in send
2021-11-03T23:46:52.036025706Z     response = self.next.send(request)
2021-11-03T23:46:52.036029806Z   File "/opt/az/lib/python3.6/site-packages/azure/core/pipeline/policies/_redirect.py", line 158, in send
2021-11-03T23:46:52.036034306Z     response = self.next.send(request)
2021-11-03T23:46:52.036038406Z   File "/opt/az/lib/python3.6/site-packages/azure/core/pipeline/policies/_retry.py", line 445, in send
2021-11-03T23:46:52.036042606Z     response = self.next.send(request)
2021-11-03T23:46:52.036046706Z   File "/opt/az/lib/python3.6/site-packages/azure/core/pipeline/policies/_authentication.py", line 117, in send
2021-11-03T23:46:52.036051006Z     self.on_request(request)
2021-11-03T23:46:52.036055007Z   File "/opt/az/lib/python3.6/site-packages/azure/core/pipeline/policies/_authentication.py", line 94, in on_request
2021-11-03T23:46:52.036059307Z     self._token = self._credential.get_token(*self._scopes)
2021-11-03T23:46:52.036063607Z   File "/opt/az/lib/python3.6/site-packages/azure/cli/core/auth/adal_authentication.py", line 41, in get_token
2021-11-03T23:46:52.036067907Z     return AccessToken(self.token['access_token'], int(self.token['expires_on']))
2021-11-03T23:46:52.036072107Z ValueError: invalid literal for int() with base 10: '11/04/2021 23:46:50 +00:00'
2021-11-03T23:46:52.126117700Z To open an issue, please run: 'az feedback'

vermegi commented 2 years ago

Hi All, I think this is the 3rd time (or at least the second) in the last couple of months that a mismatch in CLI version between the runner and this CLI task has made workflows fail. Could something be done to make this more reliable, since the versions on this task and on the runners tend to update independently of each other?

t-dedah commented 2 years ago

Hi @vermegi We have raised a PR for a new input which, if set to true, restricts latest to the version available on the agent in order to avoid any mismatch: https://github.com/Azure/cli/pull/57 Let us know if you have any comments on the solution.

vermegi commented 2 years ago

@t-dedah yes, that would be an acceptable fix.

cinderashhh commented 2 years ago

We had this issue a week ago and set azcliversion to 2.0.75. This worked for a week. Then today (about an hour ago) we suddenly hit another issue, with a different error message: [image] We tried adding the az account set command, with no luck...

sazarubin commented 2 years ago

@cinderashhh, https://github.com/Azure/cli/pull/57 changed default CLI version to the version on agent, so to use az-cli action with az login, you need to unpin az CLI version in the action (at least temporarily, until both login and CLI actions support version pinning)

pdebruin commented 2 years ago

What worked for me was to run az version on azure/login@v1, which returned 2.30.0 (YMMV), and use that as azcliversion, like:
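
A minimal sketch of that approach, assuming the version check runs as an ordinary run step after login. The step names, the secret name AZURE_CREDENTIALS, and the az account show command are hypothetical and not taken from the original comment.

```yaml
steps:
  - uses: azure/login@v1
    with:
      creds: ${{ secrets.AZURE_CREDENTIALS }}  # assumed secret name

  # Print the CLI version installed on the agent (2.30.0 in this comment).
  - name: Show agent CLI version  # hypothetical step name
    run: az version

  - name: Run Azure CLI script  # hypothetical step name
    uses: azure/CLI@v1
    with:
      azcliversion: 2.30.0  # the value reported by `az version` above
      inlineScript: |
        az account show
```

The pinned value has to be updated by hand whenever the agent's CLI version changes.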

t-dedah commented 2 years ago

@marvinbuss @rvdlaarschot @vermegi @ajmarks @pdebruin @sazarubin

  1. Released a long-term fix for any further mismatch issues. The default value for azcliversion now dynamically points to the version installed on the agent, so there will be no mismatch again unless someone explicitly specifies latest. If for some reason there is no version of az cli on the agent, the action falls back to latest.

  2. Most of the hosted agents are also updated to 2.30.0.

Please test your scenarios and let us know if you face any more issues.
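
The two behaviours described above can be sketched as follows. This is an illustration under assumptions: the step names are hypothetical, and az version is used only as a harmless command to show which CLI version each step runs.

```yaml
steps:
  # azcliversion omitted: per the comment above, the action uses the
  # az cli version already installed on the agent.
  - name: Use the agent's CLI version  # hypothetical step name
    uses: azure/CLI@v1
    with:
      inlineScript: |
        az version

  # Explicit `latest`: opts back into the newest release, which can
  # differ from the agent's version and reintroduce the mismatch.
  - name: Force the latest CLI version  # hypothetical step name
    uses: azure/CLI@v1
    with:
      azcliversion: latest
      inlineScript: |
        az version
```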

t-dedah commented 2 years ago

@cinderashhh Can you please open a new issue for your error?

Jandev commented 2 years ago

@marvinbuss @rvdlaarschot @vermegi @ajmarks @pdebruin @sazarubin

  1. Released a long-term fix for any further mismatch issues. The default value for azcliversion now dynamically points to the version installed on the agent, so there will be no mismatch again unless someone explicitly specifies latest. If for some reason there is no version of az cli on the agent, the action falls back to latest.
  2. Most of the hosted agents are also updated to 2.30.0.

Please test your scenarios and let us know if you face any more issues.

I stumbled across this issue just now and see you released a fix 2 hours ago. Should it already be available? I ask because I hit this issue ~20 minutes ago.

t-dedah commented 2 years ago

@Jandev Can you please share the workflow you are using? If it's confidential, you can share it via email at t-dedah@github.com

Jandev commented 2 years ago

I found out that one of the tasks still set azcliversion: 2.23.0, and that was causing the error. I removed it and all steps appear to be working fine now.

restfulhead commented 2 years ago

@cinderashhh, #57 changed default CLI version to the version on agent, so to use az-cli action with az login, you need to unpin az CLI version in the action (at least temporarily, until both login and CLI actions support version pinning)

Ok, but how do we pin the Azure CLI version? It wouldn't be the first time that a newer CLI version breaks things. We need predictable builds...

t-dedah commented 2 years ago

@restfulhead You can still pin to any version using azcliversion. But leaving it unset makes sure the action always picks the version already on the agent, so there will be no mismatch.

restfulhead commented 2 years ago

@t-dedah Could you please elaborate how to do that? I'm facing the same issue as the author. Here's the config that used to work, pinning to 2.28.0. This suddenly broke a few days ago without changing anything.

      - uses: Azure/login@v1.1
        with:
          creds: ...

      - name: Upload to blob storage
        uses: azure/CLI@v1
        with:
          azcliversion: 2.28.0
          inlineScript: |
              az storage blob upload-batch ...

t-dedah commented 2 years ago

@restfulhead I assume you are using a GitHub-hosted agent, which was updated to az cli version 2.30.0 this week. There is a breaking change from version 2.29.2 to 2.30.0, so all workflows using azure/login on a GitHub-hosted agent will fail with any version below 2.30.0.

To fix your problem, either set azcliversion to 2.30.0

      - name: Upload to blob storage
        uses: azure/CLI@v1
        with:
          azcliversion: 2.30.0
          inlineScript: |
              az storage blob upload-batch ...

or leave it at the default

      - name: Upload to blob storage
        uses: azure/CLI@v1
        with:
          inlineScript: |
              az storage blob upload-batch ...

In the default case, the action automatically chooses the az cli version already installed on the agent, so there will be no mismatch even if there are further breaking changes in the future.

restfulhead commented 2 years ago

@t-dedah Thank you for the details. Upgrading to 2.30.0 works, yes. However, I am still wondering what's the best way for the future. If I don't pin the version, then there's the risk that a newer cli version could break things in our workflow. So I would like to be able to pin, however, doing this produced the issue here.

I have a question based on the following, earlier comment.

"Azure Login still uses the old ADAL-based Azure CLI 2.29.0 while Azure CLI Action"

So the login action uses the cli version that comes with the hosted agent? If so, would it be feasible to add an option to also pin the login action? Then we could avoid the mismatch and would be independent of the hosted agent. What do you think?

t-dedah commented 2 years ago

@restfulhead Azure/login is owned by a separate team, so you can raise an issue there, but as per our current understanding we can't pin a version on Azure/login because it directly uses the agent version.

Yes, you have a valid point that some future version can still break the actions, but the above fix makes sure it's at least not due to a mismatch between the version on the agent and the version used in azure/cli.

dkirrane commented 2 years ago

It looks like this change has broken the Azure ansible-collections also.

For example the following no longer works for me.

  azure_rm_postgresqlfirewallrule:
    auth_source: cli

I've raised separately here https://github.com/ansible-collections/azure/issues/688

t-dedah commented 2 years ago

@dkirrane Unfortunately we don't own that action, so we can't help you on this one.

vermegi commented 2 years ago

I just retested with leaving out the azcliversion parameter and can confirm it works for my previously failing workflow. Code at https://github.com/Azure-Samples/azure-spring-cloud-blue-green/blob/main/.github/workflows/blue-green-deploy.yml

t-dedah commented 2 years ago

Thank you @vermegi for the confirmation.

restfulhead commented 2 years ago

@restfulhead Azure/login is owned by a separate team, so you can raise an issue there, but as per our current understanding we can't pin a version on Azure/login because it directly uses the agent version.

Ok, thanks for confirming. Someone else already created an issue. For anyone following/interested, please upvote https://github.com/Azure/login/issues/164.

jiasli commented 2 years ago

@ajmarks, your ValueError: invalid literal for int() with base 10: '11/04/2021 23:46:50 +00:00' issue has been fixed by https://github.com/Azure/azure-cli/pull/20219 and the fix will be released in Azure CLI 2.31.0.

t-dedah commented 2 years ago

Closing this issue as a fix was released 2 weeks ago.