Azure / azure-cli

Azure Command-Line Interface
MIT License

`az network vnet list -g rg_name` returns different results when executed on different machines #17293

Open taurit opened 3 years ago

taurit commented 3 years ago

Describe the bug The command az network vnet list -g {rgName} sometimes repeatedly returns an incorrect value (an empty array, []). The same command executed on another machine at the same time for the same resource (e.g. from Cloud Shell in the Azure Portal) returns the correct, non-empty result.

To Reproduce This might be complex to reproduce in practice, so the correlation IDs from the logs might be more useful. The scenario is:
1) Deploy an AKS cluster using an ARM template. Set the parameter nodeResourceGroup to some unique value, e.g. resourcegroup-123.
2) Shortly after the cluster is deployed, run az network vnet list -g resourcegroup-123.
3) The response is often an empty array [], even if the virtual network already exists and is accessible from other instances of az cli or the Azure Portal.
4) Retry several times. The response can remain [] for 40 minutes of retries or longer. Then it can start responding properly, or our pipeline times out.
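
Step 4 can be scripted as a bounded polling loop. This is only a sketch of a workaround, not a fix for the inconsistency itself: wait_for_output is a hypothetical helper name, and the timeout and interval are assumptions to tune for your own pipeline. It also assumes that -o tsv prints nothing when the list is empty.

```shell
# Poll a command until it prints a non-empty result or the deadline passes.
# Usage: wait_for_output TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
wait_for_output() {
  local timeout="$1" interval="$2" deadline out
  shift 2
  deadline=$(( $(date +%s) + timeout ))
  while :; do
    # Treat a failing command like an empty result and keep polling.
    out="$("$@" 2>/dev/null)" || out=""
    if [ -n "$out" ]; then
      printf '%s\n' "$out"
      return 0
    fi
    # Give up once the deadline is reached.
    [ "$(date +%s)" -ge "$deadline" ] && return 1
    sleep "$interval"
  done
}

# Example (resource group name taken from the steps above):
# wait_for_output 2400 30 az network vnet list -g resourcegroup-123 -o tsv --query "[].name"
```

The helper returns 1 on timeout, so a pipeline step can fail with a clear status instead of continuing with an empty value.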

Expected behavior The command should return the same list of virtual networks in the resource group, regardless of which instance of az cli we use to issue the query.

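Environment summary Azure Pipelines using the AzureCLI@1 task.

azure-cli 2.20.0
core 2.20.0
telemetry 1.0.6
Extensions:
azure-devops 0.18.0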

Additional context I ran az network vnet list -g {rgName} --debug (with the --debug parameter). Here is the interesting part, which shows that the response is inconsistent across clients:

Output from Azure Pipelines:

(...)
DEBUG: azure.core.pipeline.policies._universal: Request URL: 'https://management.azure.com/subscriptions/ec0308fb-8d6e-497a-9dd6-99c7d9ab7962/resourceGroups/unicorn-dev-euw-aksresources-rg/providers/Microsoft.Network/virtualNetworks?api-version=2020-08-01'
DEBUG: azure.core.pipeline.policies._universal: Request method: 'GET'
DEBUG: azure.core.pipeline.policies._universal: Request headers:
DEBUG: azure.core.pipeline.policies._universal:     'Accept': 'application/json'
DEBUG: azure.core.pipeline.policies._universal:     'x-ms-client-request-id': 'ffc98a5f-830f-11eb-bcd3-49af26ce1415'
DEBUG: azure.core.pipeline.policies._universal:     'CommandName': 'network vnet list'
DEBUG: azure.core.pipeline.policies._universal:     'ParameterSetName': '-g --debug'
DEBUG: azure.core.pipeline.policies._universal:     'User-Agent': 'AZURECLI/2.20.0 (DEB) azsdk-python-azure-mgmt-network/17.1.0 Python/3.6.10 (Linux-5.4.0-1040-azure-x86_64-with-debian-bullseye-sid) VSTS_4aa70140-adae-4225-af61-2e88983101cc_build_670_0'
DEBUG: azure.core.pipeline.policies._universal:     'Authorization': '*****'
DEBUG: azure.core.pipeline.policies._universal: Request body:
DEBUG: azure.core.pipeline.policies._universal: This request has no body
DEBUG: urllib3.connectionpool: Starting new HTTPS connection (1): management.azure.com:443
DEBUG: urllib3.connectionpool: https://management.azure.com:443 "GET /subscriptions/ec0308fb-8d6e-497a-9dd6-99c7d9ab7962/resourceGroups/unicorn-dev-euw-aksresources-rg/providers/Microsoft.Network/virtualNetworks?api-version=2020-08-01 HTTP/1.1" 200 133
DEBUG: azure.core.pipeline.policies._universal: Response status: 200
DEBUG: azure.core.pipeline.policies._universal: Response headers:
DEBUG: azure.core.pipeline.policies._universal:     'Cache-Control': 'no-cache'
DEBUG: azure.core.pipeline.policies._universal:     'Pragma': 'no-cache'
DEBUG: azure.core.pipeline.policies._universal:     'Content-Type': 'application/json; charset=utf-8'
DEBUG: azure.core.pipeline.policies._universal:     'Content-Encoding': 'gzip'
DEBUG: azure.core.pipeline.policies._universal:     'Expires': '-1'
DEBUG: azure.core.pipeline.policies._universal:     'Vary': 'Accept-Encoding'
DEBUG: azure.core.pipeline.policies._universal:     'x-ms-ratelimit-remaining-subscription-reads': '11996'
DEBUG: azure.core.pipeline.policies._universal:     'x-ms-request-id': '7ca9ef1a-8215-484d-b1f4-921ed6f727ab'
DEBUG: azure.core.pipeline.policies._universal:     'x-ms-correlation-request-id': '7ca9ef1a-8215-484d-b1f4-921ed6f727ab'
DEBUG: azure.core.pipeline.policies._universal:     'x-ms-routing-request-id': 'NORTHEUROPE:20210312T085031Z:7ca9ef1a-8215-484d-b1f4-921ed6f727ab'
DEBUG: azure.core.pipeline.policies._universal:     'Strict-Transport-Security': 'max-age=31536000; includeSubDomains'
DEBUG: azure.core.pipeline.policies._universal:     'X-Content-Type-Options': 'nosniff'
DEBUG: azure.core.pipeline.policies._universal:     'Date': 'Fri, 12 Mar 2021 08:50:30 GMT'
DEBUG: azure.core.pipeline.policies._universal:     'Content-Length': '133'
DEBUG: azure.core.pipeline.policies._universal: Response content:
DEBUG: azure.core.pipeline.policies._universal: {"value":[]}
(...)

Output from Cloud Shell in Azure Portal:

(...)
DEBUG: azure.core.pipeline.policies._universal: Request URL: 'https://management.azure.com/subscriptions/ec0308fb-8d6e-497a-9dd6-99c7d9ab7962/resourceGroups/unicorn-dev-euw-aksresources-rg/providers/Microsoft.Network/virtualNetworks?api-version=2020-08-01'
DEBUG: azure.core.pipeline.policies._universal: Request method: 'GET'
DEBUG: azure.core.pipeline.policies._universal: Request headers:
DEBUG: azure.core.pipeline.policies._universal:     'Accept': 'application/json'
DEBUG: azure.core.pipeline.policies._universal:     'x-ms-client-request-id': '1aa7c576-830f-11eb-a13f-0a580af4c047'
DEBUG: azure.core.pipeline.policies._universal:     'CommandName': 'network vnet list'
DEBUG: azure.core.pipeline.policies._universal:     'ParameterSetName': '-g --debug'
DEBUG: azure.core.pipeline.policies._universal:     'User-Agent': 'AZURECLI/2.20.0 (DEB) azsdk-python-azure-mgmt-network/17.1.0 Python/3.6.10 (Linux-4.15.0-1108-azure-x86_64-with-debian-10.2) cloud-shell/1.0'
DEBUG: azure.core.pipeline.policies._universal:     'Authorization': '*****'
DEBUG: azure.core.pipeline.policies._universal: Request body:
DEBUG: azure.core.pipeline.policies._universal: This request has no body
DEBUG: urllib3.connectionpool: Starting new HTTPS connection (1): management.azure.com:443
DEBUG: urllib3.connectionpool: https://management.azure.com:443 "GET /subscriptions/ec0308fb-8d6e-497a-9dd6-99c7d9ab7962/resourceGroups/unicorn-dev-euw-aksresources-rg/providers/Microsoft.Network/virtualNetworks?api-version=2020-08-01 HTTP/1.1" 200 None
DEBUG: azure.core.pipeline.policies._universal: Response status: 200
DEBUG: azure.core.pipeline.policies._universal: Response headers:
DEBUG: azure.core.pipeline.policies._universal:     'Cache-Control': 'no-cache'
DEBUG: azure.core.pipeline.policies._universal:     'Pragma': 'no-cache'
DEBUG: azure.core.pipeline.policies._universal:     'Transfer-Encoding': 'chunked'
DEBUG: azure.core.pipeline.policies._universal:     'Content-Type': 'application/json; charset=utf-8'
DEBUG: azure.core.pipeline.policies._universal:     'Content-Encoding': 'gzip'
DEBUG: azure.core.pipeline.policies._universal:     'Expires': '-1'
DEBUG: azure.core.pipeline.policies._universal:     'Vary': 'Accept-Encoding'
DEBUG: azure.core.pipeline.policies._universal:     'x-ms-request-id': '71fbdd59-3b3f-480a-b1f8-564014cff345'
DEBUG: azure.core.pipeline.policies._universal:     'x-ms-correlation-request-id': '14089693-8490-4d0a-a745-cc2519a00482'
DEBUG: azure.core.pipeline.policies._universal:     'x-ms-arm-service-request-id': '1d77dda7-70fb-4b68-a097-8b6736131fc1'
DEBUG: azure.core.pipeline.policies._universal:     'Strict-Transport-Security': 'max-age=31536000; includeSubDomains'
DEBUG: azure.core.pipeline.policies._universal:     'Server': 'Microsoft-HTTPAPI/2.0, Microsoft-HTTPAPI/2.0'
DEBUG: azure.core.pipeline.policies._universal:     'x-ms-ratelimit-remaining-subscription-reads': '11996'
DEBUG: azure.core.pipeline.policies._universal:     'x-ms-routing-request-id': 'WESTEUROPE:20210312T084406Z:14089693-8490-4d0a-a745-cc2519a00482'
DEBUG: azure.core.pipeline.policies._universal:     'X-Content-Type-Options': 'nosniff'
DEBUG: azure.core.pipeline.policies._universal:     'Date': 'Fri, 12 Mar 2021 08:44:06 GMT'
DEBUG: azure.core.pipeline.policies._universal: Response content:
DEBUG: azure.core.pipeline.policies._universal: {
  "value": [
    {
      "name": "aks-vnet-85333569",
      "id": "/subscriptions/ec0308fb-8d6e-497a-9dd6-99c7d9ab7962/resourceGroups/unicorn-dev-euw-aksresources-rg/providers/Microsoft.Network/virtualNetworks/aks-vnet-85333569",
      "...": "... long JSON content here"
    }
  ]
(...)

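Full output of az cli --debug is attached:
2021-03-12_08-35-24-run-from-azure-pipelines-returns-empty-array.txt (https://github.com/Azure/azure-cli/files/6129714/2021-03-12_08-35-24-run-from-azure-pipelines-returns-empty-array.txt)
2021-03-12_08-44-06-run-from-azure-portal-cli-returns-correct-result.txt (https://github.com/Azure/azure-cli/files/6129715/2021-03-12_08-44-06-run-from-azure-portal-cli-returns-correct-result.txt)
2021-03-12_08-50-30-run-from-azure-pipelines-returns-empty-array-again.txt (https://github.com/Azure/azure-cli/files/6129716/2021-03-12_08-50-30-run-from-azure-pipelines-returns-empty-array-again.txt)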
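
In the two excerpts above, the empty response was routed through NORTHEUROPE and the correct one through WESTEUROPE, and only the correct one carries an x-ms-arm-service-request-id header. Whether that explains the bug is not established here, but a small sketch like the following makes it easier to compare saved --debug logs side by side. The function names routing_headers and routing_region are hypothetical, and the parsing assumes the log line format shown above.

```shell
# Print the request-tracking headers from a saved `az ... --debug` log
# as name=value pairs, one per line.
# Usage: routing_headers LOGFILE
routing_headers() {
  grep -E "'x-ms-(routing-request-id|arm-service-request-id|correlation-request-id)'" "$1" |
    sed -E "s/^.*'(x-ms-[a-z-]+)': '([^']*)'.*\$/\1=\2/"
}

# Print only the region prefix of x-ms-routing-request-id, e.g. NORTHEUROPE.
# Usage: routing_region LOGFILE
routing_region() {
  routing_headers "$1" |
    sed -n 's/^x-ms-routing-request-id=\([A-Z0-9]*\):.*$/\1/p'
}
```

Running routing_region on each attached log shows at a glance which region served each call, and the absence of an x-ms-arm-service-request-id line in the routing_headers output marks the responses that lacked that header.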

yonzhan commented 3 years ago

network

msyyc commented 3 years ago

@service team. According to the debug info, the clients send the same request body to the same REST API, but the responses are different. Please pay attention to it.

ghost commented 3 years ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @aznetsuppgithub.

Issue Details
Author: taurit
Assignees: msyyc
Labels: `Network`, `Service Attention`
Milestone: S184

hollowdrutt commented 2 years ago

Any progress on this? I ran into the problem while following along with the book Hands-On Kubernetes on Azure, where you are instructed to run these commands:

nodeResourceGroup=$(az aks show -n handsonaks \
-g rg-handsonaks -o tsv --query "nodeResourceGroup")

aksVnetName=$(az network vnet list \
-g $nodeResourceGroup -o tsv --query "[0].name")

aksVnetId=$(az network vnet show -n $aksVnetName \
-g $nodeResourceGroup -o tsv --query "id")

az network vnet peering create \
-n AppGWtoAKSVnetPeering -g agic \
--vnet-name agic-vnet --remote-vnet $aksVnetId \
--allow-vnet-access

And it becomes quite confusing when it doesn't work and the vnet is visible in the Azure Portal but az network vnet list returns []. After 10 minutes or so the command started to return the correct result.
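
One way to make this failure less confusing is to stop as soon as a captured value comes back empty, before it is passed to the next command. A minimal sketch follows; require_value is a hypothetical helper, not part of the book or of az cli, and it assumes that -o tsv prints nothing when the query result is empty.

```shell
# Abort with a readable message when a captured value is empty.
# Usage: require_value NAME VALUE
require_value() {
  if [ -z "$2" ]; then
    printf 'error: %s is empty; the list call may have returned []\n' "$1" >&2
    return 1
  fi
}

# Example, placed right after the second command above:
# require_value aksVnetName "$aksVnetName" || exit 1
```

With this check the script fails at the list step instead of at az network vnet show or the peering command, which points at the real cause.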

yonzhan commented 2 years ago

network service team should look into this

borgewi commented 2 years ago

Any progress here? We are experiencing the same problem: an empty list for some resource groups and non-empty lists for other resource groups.

ivan-nushev commented 1 year ago

We observed similar behaviour in a subscription of the company I work for. Every 10 minutes we poll the Azure VirtualNetworks API through the Azure Java library, version 1.41.3. The poll normally returns all 25 VirtualNetworks, but one occasional result contained only 7. After the odd result the client went back to returning all 25 VirtualNetworks. This happens rarely, but its impact is significant because our application logic cleans up the application-specific representations of VirtualNetworks whenever there is a discrepancy. Why does the Azure library misbehave? What can we do to mitigate it? Is there a fix for this in newer library versions?
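
One possible client-side mitigation, offered only as an assumption and not as a confirmed fix, is to act on a listing only after the same result has been seen in several consecutive polls, so that a single odd response does not trigger cleanup. The sketch below is in shell rather than Java to match the other commands in this thread; is_stable and the history file format (one fingerprint line per poll) are hypothetical.

```shell
# Return 0 only if the last K lines of a history file are identical,
# i.e. the same listing fingerprint was observed K polls in a row.
# Usage: is_stable HISTORY_FILE K
is_stable() {
  local history="$1" k="$2" lines distinct
  lines=$(wc -l < "$history")
  # Not enough polls recorded yet.
  [ "$((lines))" -ge "$k" ] || return 1
  distinct=$(tail -n "$k" "$history" | sort -u | wc -l)
  [ "$((distinct))" -eq 1 ]
}

# Example of recording one fingerprint line per poll (sorted names, comma-joined):
# { az network vnet list --query "[].name" -o tsv | sort | tr '\n' ','; echo; } >> history.txt
# is_stable history.txt 3 && echo "listing confirmed, safe to reconcile"
```

The trade-off is latency: with a 10-minute poll interval and K set to 3, a genuine deletion is acted on roughly 20 minutes later than it would be otherwise.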

pawel-zolty commented 1 year ago

I have a similar problem. I did this exercise from az-700, using az cli instead of the Azure Portal to create the resources. When I list the vnets with az network vnet list --query "[[].name]" I see:

[
  [
    "ManufacturingVnet",
    "ResearchVnet"
  ]
]

CoreServicesVnet is missing. In the Azure Portal I can see three vnets, and I am sure that all 3 vnets exist.

fals commented 1 year ago

I have the same issue when using the Azure SDK for Go. When creating a new cluster, all resources are available in the Azure Portal and the cluster works as intended, but if you try to list the vnets you get nothing from that cluster. Sometimes it takes hours for the API to return the correct result.

shirsa commented 1 year ago

I have the same issue with az network private-endpoint list. When the request is routed through QATARCENTRAL, the response is empty. When the request is routed through FRANCESOUTH, the response is as expected and also contains the x-ms-arm-service-request-id header.

v-maxjohnson commented 2 months ago

Are you working on this, @necusjz? This issue still occurs, and it has been open for multiple years now.