Azure / azure-cli

Azure Command-Line Interface
MIT License

az vm image list can't find publisher #11767

Closed SamaraSoucy-MSFT closed 3 years ago

SamaraSoucy-MSFT commented 4 years ago

Describe the bug

Command Name az vm image list

Errors:

Publisher: Symantec.test.ru4mp1.latest was not found.

To Reproduce:

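Steps to reproduce the behavior. Note that argument values have been redacted, as they may contain sensitive information.

az vm image list --offer {} --all --location {} --output {} --debug (may only reproduce with the exact 'az vm image list --offer UbuntuServer --all --location eastus2' command)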

Expected Behavior

Environment Summary

Windows-10-10.0.18362-SP0
Python 3.6.6
Shell: powershell.exe

azure-cli 2.0.74

Extensions:
azure-devops 0.11.0
dev-spaces 1.0.3
webapp 0.2.19

Additional Context

Original customer thread: https://github.com/MicrosoftDocs/azure-docs-cli/issues/1732. I was able to reproduce it, and I've sent an email with the debug log to the DL.

bim-msft commented 4 years ago

@qwordy Please take a look.

qwordy commented 4 years ago

I have reproduced the bug. With '--debug', you can see that this command retrieves lots of images, but at the end of the output there is an error when retrieving image info from publisher Symantec.test.ru4mp1.latest. I think I need to add a fault-tolerance mechanism.

farialima commented 4 years ago

Having a similar issue:

bash-3.2$ az vm image list --offer Debian --all --location francecentral
Publisher: Microsoft.Azure.Extensions.Testb2e612e0-1c8c-40ed-93fe-a169ed7619cb-20190219154622 was not found.
bash-3.2$ 

This is problematic because it prevents discovering images. The only workaround is to find a publisher (somehow) and re-run with it, e.g.

az vm image list --offer Debian --all --location francecentral --publisher credativ

Running with --debug shows the following potential causes:

(...)
urllib3.connectionpool : https://management.azure.com:443 "GET /subscriptions/c0c11638-e31a-4ea3-815e-d33c8ef7e55b/providers/Microsoft.Compute/locations/francecentral/publishers/Microsoft.Azure.Extensions.Testb2e612e0-1c8c-40ed-93fe-a169ed7619cb-20190219154622/artifacttypes/vmimage/offers?api-version=2019-07-01 HTTP/1.1" 404 175

(...)

msrest.http_logger :     'x-ms-request-id': '7401e2f6-5ab4-4aaf-8c81-41e86d9fb5da'

(...)

msrest.http_logger : Response content:
msrest.http_logger :     'x-ms-ratelimit-remaining-subscription-reads': '8268'
msrest.universal_http : Configuring request: timeout=100, verify=True, cert=None
msrest.http_logger :     'Transfer-Encoding': 'chunked'
msrest.http_logger : {
  "error": {
    "code": "NotFound",
    "message": "Publisher: Microsoft.Azure.Extensions.Testb2e612e0-1c8c-40ed-93fe-a169ed7619cb-20190219154622 was not found."
  }
}

(...)

cli.azure.cli.core.util : Publisher: Microsoft.Azure.Extensions.Testb2e612e0-1c8c-40ed-93fe-a169ed7619cb-20190219154622 was not found.
Publisher: Microsoft.Azure.Extensions.Testb2e612e0-1c8c-40ed-93fe-a169ed7619cb-20190219154622 was not found.
az_command_data_logger : exit code: 1
telemetry.save : Save telemetry record of length 2553 in cache
telemetry.check : Returns Positive.
telemetry.main : Begin creating telemetry upload process.
telemetry.process : Creating upload process: "/usr/local/Cellar/azure-cli/2.0.79_2/libexec/bin/python /usr/local/Cellar/azure-cli/2.0.79_2/libexec/lib/python3.8/site-packages/azure/cli/telemetry/__init__.py /Users/francois/.azure"
telemetry.process : Return from creating process
telemetry.main : Finish creating telemetry upload process.
command ran in 343.665 seconds.
bash-3.2$ 
BhargaviAnnadevara commented 4 years ago

Similar issue reported here.

Running the command: az vm image list -f "Windows-10" --all -otable

returns: Publisher: Microsoft.Compute.TestSar was not found.

From what I see in the --debug logs, the location seems to be defaulting to northeurope when running from Azure Cloud Shell (vs probably a different location while running from a client).

I am consistently able to reproduce this error by passing northeurope as the --location (optional parameter) with the command regardless of where I run it from. I see that other locations like uksouth and centralindia work.

MrBill2U commented 4 years ago

While we wait for the code fix, can someone please do a data fix to get rid of the reference to Symantec.test.ru4mp1.latest?

qwordy commented 4 years ago

Hi service team, could you help confirm and fix the data inconsistency problem? In Azure CLI:

Step 1. Use client.virtual_machine_images.list_publishers(location) to get publishers.
Step 2. Use client.virtual_machine_images.list_offers(location, publisher) to get offers.
Step 3, 4, ...

Step 2 fails for some publishers retrieved from step 1.
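
The failure mode in the steps above can be sketched without touching Azure at all. This is only an illustration: `FakeImagesClient` and `PublisherNotFound` are hypothetical stand-ins, not SDK classes, and the catalogue data is made up apart from the publisher names quoted in this thread. It shows how a naive walk over the step 1 results lets a single stale publisher abort the whole listing.

```python
class PublisherNotFound(Exception):
    """Hypothetical stand-in for the service's NotFound error."""


class FakeImagesClient:
    """Hypothetical stub that mimics the two calls named in steps 1 and 2."""

    def __init__(self, offers_by_publisher, stale_publishers):
        self._offers = offers_by_publisher
        self._stale = set(stale_publishers)

    def list_publishers(self, location):
        # Step 1: returns every publisher, including stale entries.
        return sorted(set(self._offers) | self._stale)

    def list_offers(self, location, publisher):
        # Step 2: fails for publishers that step 1 should not have returned.
        if publisher in self._stale:
            raise PublisherNotFound(f"Publisher: {publisher} was not found.")
        return self._offers[publisher]


def list_all_offers(client, location):
    """Naive walk: one failing publisher aborts the whole listing."""
    found = []
    for publisher in client.list_publishers(location):
        for offer in client.list_offers(location, publisher):
            found.append((publisher, offer))
    return found


client = FakeImagesClient(
    offers_by_publisher={"credativ": ["Debian"]},
    stale_publishers=["Symantec.test.ru4mp1.latest"],
)
try:
    list_all_offers(client, "francecentral")
    outcome = "ok"
except PublisherNotFound as exc:
    outcome = str(exc)
print(outcome)
```

Even though `credativ` has a valid offer, nothing is returned, which matches the behavior users reported: the command prints only the "was not found" message.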

qwordy commented 4 years ago

I wrote a patch for this problem: it ignores the bad data and outputs the valid images. However, the data should still be fixed.

qwordy commented 4 years ago

The problem should be resolved in Azure CLI. Please download the latest version of Azure CLI. Feel free to open a new issue if you run into any further problems.

jonathanbrenes commented 3 years ago

I found the same behavior again in the Azure Cloud Shell:

jonathan@Azure:~ az vm image list --location eastus --all --offer CentOS
ResourceNotFoundError: (NotFound) Publisher: Symantec.test.ru4mp1.latest was not found.
jonathan@Azure:~ az --version
azure-cli 2.17.1

core 2.17.1
telemetry 1.0.6

Extensions:
vm-repair 0.3.4
ai-examples 0.2.5
azure-cli-ml 1.19.0

Python location '/opt/az/bin/python3'
Extensions directory '/home/jonathan/.azure/cliextensions'
Extensions system directory '/opt/az/lib/python3.6/site-packages/azure-cli-extensions'

Python (Linux) 3.6.10 (default, Dec 31 2020, 08:28:38) [GCC 8.3.0]

Legal docs and information: aka.ms/AzureCliLegal

Your CLI is up-to-date.

Please let us know how we are doing: https://aka.ms/azureclihats and let us know if you're interested in trying out our newest features: https://aka.ms/CLIUXstudy

dcfsc commented 3 years ago

Similar issue with latest CLI on Linux with a limited scope

This works:

$ az vm extension image list 
[
  {
    "name": "AcronisBackup",
    "publisher": "Acronis.Backup",
    "version": "1.0.33"
  },
  {
    "name": "AcronisBackup",
    "publisher": "Acronis.Backup",
    "version": "1.0.51"
  },
...

A limited query reports an error and does not output anything

$ az vm extension image list --location eastus2 
ResourceNotFoundError: (NotFound) Publisher: Symantec.test.ru4mp1.latest was not found.

CLI version

$ az --version
azure-cli                         2.18.0

core                              2.18.0
telemetry                          1.0.6

Extensions:
azure-devops                      0.18.0

Python location '/usr/bin/python3'
Extensions directory '/home/dchwalis/.azure/cliextensions'

Python (Linux) 3.6.8 (default, Nov 16 2020, 16:55:22) 
[GCC 4.8.5 20150623 (Red Hat 4.8.5-44)]

Your CLI is up-to-date.
qwordy commented 3 years ago

Let me try to reproduce it.

max-arnold commented 3 years ago

Same here:

% az vm extension image list --location westeurope -o table

ERROR: ResourceNotFoundError: (NotFound) Publisher: Microsoft.Azure.Extensions.Testb2e612e0-1c8c-40ed-93fe-a169ed7619cb-20190219154622 was not found.
% az --version

azure-cli                         2.18.0

core                              2.18.0
telemetry                          1.0.6

Python location '/Users/user/.virtualenvs/azure-cli/bin/python'
Extensions directory '/Users/user/.azure/cliextensions'

Python (Darwin) 3.7.9 (default, Sep  6 2020, 13:20:25)
[Clang 11.0.3 (clang-1103.0.32.62)]

Legal docs and information: aka.ms/AzureCliLegal

Your CLI is up-to-date.
edburns commented 3 years ago

I am now seeing this on the most basic of image queries:

az vm image list -l eastus --all 
Command group 'vm' is experimental and under development. Reference and support levels: https://aka.ms/CLI_refstatus
You are retrieving all the images from server which could take more than a minute. To shorten the wait, provide '--publisher', '--offer' or '--sku'. Partial name search is supported.
(NotFound) Publisher: Symantec.test.ru4mp1.latest was not found.

FWIW, I have filed ticket 2102150010003073 as well.

 az --version
azure-cli                         2.19.1

core                              2.19.1
telemetry                          1.0.6

Extensions:
resource-graph                     1.1.0
spring-cloud                       2.1.1

Python location '/opt/az/bin/python3'
Extensions directory '/home/edburns/.azure/cliextensions'

Python (Linux) 3.6.10 (default, Feb 10 2021, 05:17:43)
[GCC 7.5.0]

Legal docs and information: aka.ms/AzureCliLegal

Your CLI is up-to-date.

Please let us know how we are doing: https://aka.ms/azureclihats
and let us know if you're interested in trying out our newest features: https://aka.ms/CLIUXstudy
Drewm3 commented 3 years ago

@qwordy, could you clarify what service APIs are failing and what parameters are required to trigger the failure? From the list above it is unclear if this is an issue on the client, bad data from Marketplace, or an actual service bug.

@olayemio, please help route this to the appropriate dev team if the issue is actually a problem in the service API.

edburns commented 3 years ago

@Drewm3 @qwordy @olayemio FWIW, I have filed an IcM for this: 228055015 .

ghost commented 3 years ago

Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @Drewm3, @vaibhav-agar.

Issue Details
Describe the bug

**Command Name** `az vm image list`

**Errors:**
```
Publisher: Symantec.test.ru4mp1.latest was not found.
```

## To Reproduce:
Steps to reproduce the behavior. Note that argument values have been redacted, as they may contain sensitive information.

- _Put any pre-requisite steps here..._
- `az vm image list --offer {} --all --location {} --output {} --debug` (may only reproduce with the exact 'az vm image list --offer UbuntuServer --all --location eastus2' command)

## Expected Behavior

## Environment Summary
```
Windows-10-10.0.18362-SP0
Python 3.6.6
Shell: powershell.exe

azure-cli 2.0.74

Extensions:
azure-devops 0.11.0
dev-spaces 1.0.3
webapp 0.2.19
```

## Additional Context
Original customer thread https://github.com/MicrosoftDocs/azure-docs-cli/issues/1732 I was able to reproduce it and I've sent an email with the debug log to the DL.
Author: SamaraSoucy-MSFT
Assignees: qwordy, olayemio
Labels: `Compute - Images`, `Service Attention`, `bug`
Milestone: S166
Drewm3 commented 3 years ago

This appears to be a combination of some data corruption on the image metadata list as well as a bug in the APIs where the problem publishers are being returned. The dev team is investigating the issue and we'll update this thread as more details are known.

qwordy commented 3 years ago

I have reproduced the reported issues with both az vm image list and az vm extension image list. I fixed az vm image list before by adding try-except to make it more robust. However, a recent update of the Azure Python SDK throws azure.core.exceptions.ResourceNotFoundError instead of msrestazure.azure_exceptions.CloudError. It is a major update, from 12.0.0 to 17.0.0: https://pypi.org/project/azure-mgmt-compute/17.0.0b1/. So the error happens again, and az vm extension image list is affected in the same way.

            try:
                skus = client.virtual_machine_images.list_skus(location, publisher, o.name)
            except CloudError as e:
                logger.warning(str(e))
                continue

It can be easily fixed. I am working on it.
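
Why the old guard stopped working can be shown with a minimal sketch. The two exception classes below are local stand-ins that only borrow the names of the real SDK exceptions, and the sketch assumes, as the comment above implies, that neither class derives from the other. The `list_skus_*` functions and `guarded` are hypothetical helpers for illustration.

```python
class CloudError(Exception):
    """Stand-in for msrestazure.azure_exceptions.CloudError."""


class ResourceNotFoundError(Exception):
    """Stand-in for azure.core.exceptions.ResourceNotFoundError."""


def list_skus_old_sdk(publisher):
    # Hypothetical: how the older SDK reported a missing publisher.
    raise CloudError(f"Publisher: {publisher} was not found.")


def list_skus_new_sdk(publisher):
    # Hypothetical: how the newer SDK reports the same condition.
    raise ResourceNotFoundError(f"Publisher: {publisher} was not found.")


def guarded(list_skus, publisher):
    """Mirror the original guard, which only knows about CloudError."""
    try:
        return list_skus(publisher)
    except CloudError:
        return []  # bad publisher skipped, as intended


# Old SDK: the guard catches the error and the publisher is skipped.
skipped_result = guarded(list_skus_old_sdk, "Microsoft.Compute.TestSar")
print(skipped_result)

# New SDK: the exception type changed, so it escapes the guard.
try:
    guarded(list_skus_new_sdk, "Microsoft.Compute.TestSar")
    escaped = False
except ResourceNotFoundError:
    escaped = True
print(escaped)
```

An `except` clause matches only the listed class and its subclasses, so a change of exception type in a dependency silently disables a guard like this one.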

qwordy commented 3 years ago

I wrote a PR to fix it. https://github.com/Azure/azure-cli/pull/16992. Release date is 3/2/2021.

qwordy commented 3 years ago

@Drewm3, Potential broken data:

az vm extension image list --location westeurope -o table
(NotFound) Publisher: Microsoft.Azure.Extensions.Testb2e612e0-1c8c-40ed-93fe-a169ed7619cb-20190219154622 was not found.
(NotFound) Publisher: Microsoft.Compute.TestSar was not found.
(NotFound) Publisher: Microsoft.SystemCenter.Test was not found.
(NotFound) Publisher: Symantec.test.ru2final was not found.
(NotFound) Publisher: Symantec.test.ru4mp1.latest was not found.

az vm image list --location eastus --all --offer CentOS -otable
(NotFound) Publisher: Symantec.test.ru4mp1.latest was not found.
(NotFound) Publisher: TrendMicro.DeepSecurity.Test was not found.

It fails when fetching offers for a given publisher. The APIs are defined here: https://docs.microsoft.com/en-us/rest/api/compute/virtualmachineimages

Azure CLI code:

publishers = client.virtual_machine_images.list_publishers(location)
offers = client.virtual_machine_images.list_offers(location, publisher)
skus = client.virtual_machine_images.list_skus(location, publisher, o.name)
images = client.virtual_machine_images.list(location, publisher, o.name, s.name)

Complete code:

def load_images_thru_services(cli_ctx, publisher, offer, sku, location):
    from concurrent.futures import ThreadPoolExecutor, as_completed
    all_images = []
    client = _compute_client_factory(cli_ctx)
    if location is None:
        location = get_one_of_subscription_locations(cli_ctx)

    def _load_images_from_publisher(publisher):
        from azure.core.exceptions import ResourceNotFoundError
        try:
            offers = client.virtual_machine_images.list_offers(location, publisher)
        except ResourceNotFoundError as e:
            logger.warning(str(e))
            return
        if offer:
            offers = [o for o in offers if _matched(offer, o.name)]
        for o in offers:
            try:
                skus = client.virtual_machine_images.list_skus(location, publisher, o.name)
            except ResourceNotFoundError as e:
                logger.warning(str(e))
                continue
            if sku:
                skus = [s for s in skus if _matched(sku, s.name)]
            for s in skus:
                try:
                    images = client.virtual_machine_images.list(location, publisher, o.name, s.name)
                except ResourceNotFoundError as e:
                    logger.warning(str(e))
                    continue
                for i in images:
                    all_images.append({
                        'publisher': publisher,
                        'offer': o.name,
                        'sku': s.name,
                        'version': i.name})

    publishers = client.virtual_machine_images.list_publishers(location)
    if publisher:
        publishers = [p for p in publishers if _matched(publisher, p.name)]

    publisher_num = len(publishers)
    if publisher_num > 1:
        with ThreadPoolExecutor(max_workers=_get_thread_count()) as executor:
            tasks = [executor.submit(_load_images_from_publisher, p.name) for p in publishers]
            for t in as_completed(tasks):
                t.result()  # don't use the result but expose exceptions from the threads
    elif publisher_num == 1:
        _load_images_from_publisher(publishers[0].name)

    return all_images
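
The pattern in the function above (one task per publisher, with not-found errors swallowed inside each task) can be tried out in isolation. The following is a reduced sketch, not the CLI's code: `CATALOGUE`, `NotFound`, `list_offers`, and `load_all` are hypothetical names, and the data is invented apart from the publisher names quoted in this thread.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


class NotFound(Exception):
    """Hypothetical stand-in for ResourceNotFoundError."""


# Hypothetical catalogue: publisher -> offers; None marks a stale publisher.
CATALOGUE = {
    "credativ": ["Debian"],
    "OpenLogic": ["CentOS"],
    "Symantec.test.ru4mp1.latest": None,
}


def list_offers(publisher):
    offers = CATALOGUE[publisher]
    if offers is None:
        raise NotFound(f"Publisher: {publisher} was not found.")
    return offers


def load_all(publishers, max_workers=4):
    results, skipped = [], []

    def load_one(publisher):
        try:
            offers = list_offers(publisher)
        except NotFound as exc:
            skipped.append(str(exc))  # the real code logs a warning here
            return
        for offer in offers:
            results.append({"publisher": publisher, "offer": offer})

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        tasks = [executor.submit(load_one, p) for p in publishers]
        for task in as_completed(tasks):
            task.result()  # re-raise anything other than NotFound
    # Sort so the output does not depend on thread scheduling.
    return sorted(results, key=lambda r: r["publisher"]), skipped


images, skipped_messages = load_all(list(CATALOGUE))
print(images)
print(skipped_messages)
```

Catching the error inside each task is what keeps one bad publisher from discarding the results of the others, while `task.result()` still surfaces any unexpected exception.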
jwromeo commented 3 years ago

Hey, I'm from the service team that's returning the inconsistent publisher data. We have identified 16 instances of the publisher data inconsistency across 17 regions. We are in the process of patching the data today to remove the inconsistency. Not all publishers are impacted in all regions, the range is 1-5 problem publishers per region.

Impacted Publishers:
MICROSOFT.AZURE.EXTENSIONS.TEST94A55E3B-0448-4638-867C-6D01304F4BDF-20190219154622
SYMANTEC.TEST.RU2FINAL
MICROSOFT.WINDOWSAZURE.COMPUTE.TEST
MICROSOFT.TESTSQLSERVER.EDP
MICROSOFT.AZURE.EXTENSIONS.TESTB2E612E0-1C8C-40ED-93FE-A169ED7619CB-20190219154622
MICROSOFT.AZURE.NETWORKWATCHER.EDP
SYMANTEC.TEST.RU4MP1.LATEST
MICROSOFT.SYSTEMCENTER.TEST
TRENDMICRO.DEEPSECURITY.TEST
MICROSOFT.AZURE.EXTENSIONS.TESTFE504E88-854F-46C7-9775-16CAC9BFCF28-20190219154622
SYMANTEC.TEST.RU4MP1
SYMANTEC.CLOUDWORKLOADPROTECTION.TEST
MICROSOFT.COMPUTE.TESTSAR
MICROSOFT.SYSTEMCENTER
KASPERSKYLAB.SECURITYAGENT
MICROSOFT.OSTCEXTENSIONS.EDP

Impacted Regions:
WestEurope
EastUS
eastasia
JapanWest
EastUS2EUAP
KoreaCentral
FranceCentral
CanadaCentral
JapanEast
NorthEurope
southeastasia
BrazilSouth
CanadaEast
AustraliaEast
EastUS2
AustraliaCentral2
CentralUSEUAP

jwromeo commented 3 years ago

The publisher data inconsistency has now been confirmed as fixed. We will investigate the root cause and make the changes needed to prevent a persistent data inconsistency in the service.

WRT the current implementation, it is quite possible for the List All Publishers REST API to return a publisher for which the List Offers or List Extension Types API then returns an error saying the publisher is not found. Under normal conditions this state is transient and should correct itself within a minute, rather than persisting as it did here. In response to the PublisherNotFound error, it is safe to assume that the named publisher does not have any available images or extensions. So the CLI code changes are also worth keeping, since they prevent transient failures.
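
One way a client could act on this guidance is sketched below. `offers_or_empty`, `PublisherNotFound`, and the two fetchers are hypothetical names for illustration. The optional retry is an illustrative addition prompted by the "transient" remark, not something this thread says the CLI does; the only behavior taken from the comment above is treating a not-found publisher as having no offers.

```python
import time


class PublisherNotFound(Exception):
    """Hypothetical stand-in for the service's PublisherNotFound error."""


def offers_or_empty(fetch, publisher, retries=0, delay_seconds=0.0, sleep=time.sleep):
    """Call fetch(publisher); treat a persistent not-found as 'no offers'."""
    for attempt in range(retries + 1):
        try:
            return list(fetch(publisher))
        except PublisherNotFound:
            if attempt < retries:
                sleep(delay_seconds)  # give a transient state time to clear
    return []


calls = {"count": 0}


def flaky_fetch(publisher):
    # Hypothetical fetcher: not found on the first call, fine afterwards.
    calls["count"] += 1
    if calls["count"] == 1:
        raise PublisherNotFound(publisher)
    return ["Debian"]


def always_missing(publisher):
    # Hypothetical fetcher for a publisher that never resolves.
    raise PublisherNotFound(publisher)


recovered = offers_or_empty(flaky_fetch, "credativ", retries=1)
missing = offers_or_empty(always_missing, "Symantec.test.ru4mp1.latest")
print(recovered)
print(missing)
```

Returning an empty list keeps callers simple: they iterate over whatever comes back and never need to know whether a publisher was missing or merely had nothing to offer.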

qwordy commented 3 years ago

Hi @jwromeo , thank you for fixing it!