Azure / azqr

Azure Quick Review
https://azure.github.io/azqr
MIT License

Preview 2.0 Panic error #248

Closed red-erik closed 1 month ago

red-erik commented 1 month ago

Hello, I just tried the 2.0 preview and got the panic below. The last log line was:

INF Scanning subscriptions for Microsoft.SignalRService/webPubSub

panic: interface conversion: interface {} is map[string]interface {}, not string

goroutine 80 [running]:
github.com/Azure/azqr/internal.AprlScanner.graphScan({}, {0x2431e68, 0xc0000916d0}, 0xc00040e168, {0xc000352b40, 0xc, 0x0?}, 0xc000279350)
    D:/a/azqr/azqr/internal/aprl_scanner.go:198 +0xb94
github.com/Azure/azqr/internal.AprlScanner.Scan.func2({0xc000352b40?, 0xc?, 0x8f?}, 0xa9aa0a9aa0a9aa0?)
    D:/a/azqr/azqr/internal/aprl_scanner.go:154 +0x125
created by github.com/Azure/azqr/internal.AprlScanner.Scan in goroutine 1
    D:/a/azqr/azqr/internal/aprl_scanner.go:146 +0x254

Regards, Red.
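
Editor's note: the message "interface conversion: interface {} is map[string]interface {}, not string" is what Go prints when an unchecked type assertion such as `v.(string)` meets a value of another type. The thread does not show the code at aprl_scanner.go:198, so the following is only a minimal sketch of the general pattern, using a hypothetical helper `asString` and a made-up row, not the actual azqr fix:

```go
package main

import "fmt"

// asString is a hypothetical helper (not part of azqr). It uses the
// comma-ok form of a type assertion, which reports failure instead of
// panicking when the value is not a string.
func asString(v interface{}) string {
	if s, ok := v.(string); ok {
		return s
	}
	if v == nil {
		return ""
	}
	// Fall back to a printable form for maps, numbers, and so on.
	return fmt.Sprintf("%v", v)
}

func main() {
	// A made-up row shaped like decoded JSON: "tags" is an object, not a string.
	row := map[string]interface{}{
		"name": "vnet-01",
		"tags": map[string]interface{}{"env": "prod"},
	}

	// row["tags"].(string) would panic with the message quoted above.
	fmt.Println(asString(row["name"]))
	fmt.Println(asString(row["tags"]))
	fmt.Println(asString(row["missing"]) == "")
}
```

Whether a fallback string, an error, or skipping the row is the right reaction depends on the caller; the point is only that the checked form cannot panic.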

cmendible commented 1 month ago

Hi, thanks for reporting.

Can you please set the AZURE_SDK_GO_LOGGING environment variable to all and run the tool with the --debug flag and share the details?

red-erik commented 1 month ago

Hello, the result seems to be much the same:

[Aug 20 11:36:11.379389] Retry: response 200
[Aug 20 11:36:11.380006] Retry: exit due to non-retriable status code
panic: runtime error: index out of range [2] with length 1

goroutine 58 [running]:
github.com/Azure/azqr/internal/azqr.GetSubsctiptionFromResourceID(...)
    D:/a/azqr/azqr/internal/azqr/azqr.go:274
github.com/Azure/azqr/internal.AprlScanner.graphScan({}, {0x2431e68, 0xc000156050}, 0xc00010c2e0, {0xc000438b40, 0xc, 0x0?}, 0xc0002f8870)
    D:/a/azqr/azqr/internal/aprl_scanner.go:223 +0xb37
github.com/Azure/azqr/internal.AprlScanner.Scan.func2({0xc000438b40?, 0xc?, 0x8f?}, 0xffffffffffffffff?)
    D:/a/azqr/azqr/internal/aprl_scanner.go:154 +0x125
created by github.com/Azure/azqr/internal.AprlScanner.Scan in goroutine 1
    D:/a/azqr/azqr/internal/aprl_scanner.go:146 +0x254

Regards, Red.
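
Editor's note: "index out of range [2] with length 1" inside `GetSubsctiptionFromResourceID` is consistent with indexing the third element of a slice that has only one, which is what `strings.Split` returns for an ID containing no `/` (including the empty string). The thread does not show the body of that function, so this is a hedged sketch of a length-checked parser under that assumption; `subscriptionFromResourceID` and the sample IDs are hypothetical:

```go
package main

import (
	"fmt"
	"strings"
)

// subscriptionFromResourceID is a hypothetical, defensive parser (not the
// azqr implementation). It returns the subscription segment of an ID shaped
// like "/subscriptions/<id>/..." and false for any other shape.
func subscriptionFromResourceID(id string) (string, bool) {
	parts := strings.Split(id, "/")
	// Expected layout after the split: "", "subscriptions", "<id>", ...
	if len(parts) < 3 || !strings.EqualFold(parts[1], "subscriptions") || parts[2] == "" {
		return "", false
	}
	return parts[2], true
}

func main() {
	ids := []string{
		"/subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/rg/providers/Microsoft.Network/virtualNetworks/vnet",
		"", // splits into a single empty element, so parts[2] would panic
		"/providers/Microsoft.Management/managementGroups/mg",
	}
	for _, id := range ids {
		sub, ok := subscriptionFromResourceID(id)
		fmt.Printf("%q -> %q (ok=%v)\n", id, sub, ok)
	}
}
```

Returning a boolean lets the caller decide whether to skip the row or log it, rather than crashing the whole scan.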

red-erik commented 1 month ago

The log shows RESPONSE Status: 429 Too Many Requests, so I suppose the panic is related to throttling:
https://learn.microsoft.com/en-us/graph/throttling
https://learn.microsoft.com/en-us/azure/architecture/patterns/throttling

Regards, Red.

cmendible commented 1 month ago

> Hello, it seems to be quite the same:
>
> [Aug 20 11:36:11.379389] Retry: response 200
> [Aug 20 11:36:11.380006] Retry: exit due to non-retriable status code
> panic: runtime error: index out of range [2] with length 1
>
> goroutine 58 [running]:
> github.com/Azure/azqr/internal/azqr.GetSubsctiptionFromResourceID(...)
>     D:/a/azqr/azqr/internal/azqr/azqr.go:274
> github.com/Azure/azqr/internal.AprlScanner.graphScan({}, {0x2431e68, 0xc000156050}, 0xc00010c2e0, {0xc000438b40, 0xc, 0x0?}, 0xc0002f8870)
>     D:/a/azqr/azqr/internal/aprl_scanner.go:223 +0xb37
> github.com/Azure/azqr/internal.AprlScanner.Scan.func2({0xc000438b40?, 0xc?, 0x8f?}, 0xffffffffffffffff?)
>     D:/a/azqr/azqr/internal/aprl_scanner.go:154 +0x125
> created by github.com/Azure/azqr/internal.AprlScanner.Scan in goroutine 1
>     D:/a/azqr/azqr/internal/aprl_scanner.go:146 +0x254
>
> Regards, Red.

This seems to be yet another issue. If you run with the environment variable set and the --debug flag, you should see the Azure Resource Graph (ARG) query that causes the problem. Can you please share it?

cmendible commented 1 month ago

> We have RESPONSE Status: 429 Too Many Requests so I suppose it's related to that https://learn.microsoft.com/en-us/graph/throttling https://learn.microsoft.com/en-us/azure/architecture/patterns/throttling
>
> Regards, Red.

Throttling is already being handled and is not the cause of these panic errors.

red-erik commented 1 month ago

Hello, I don't know exactly what you mean (sorry, I'm only a user, not a developer), but the last lines before the error are:

[Aug 20 11:36:10.816275] Retry: response 429
[Aug 20 11:36:10.816275] Retry: End Try #1, Delay=1s
[Aug 20 11:36:11.378879] Response: ==> REQUEST/RESPONSE (Try=1/829.39ms, OpTime=829.39ms) -- RESPONSE RECEIVED
   POST https://management.azure.com/providers/Microsoft.ResourceGraph/resources?api-version=2021-06-01-preview
   Accept: application/json
   Authorization: REDACTED
   Content-Length: 2242
   Content-Type: application/json
   User-Agent: azsdk-go-armresourcegraph/v0.9.0 (go1.22.0; Windows_NT)

   RESPONSE Status: 200 OK
   Cache-Control: no-cache
   Content-Length: 275362
   Content-Type: application/json; charset=utf-8
   Date: Tue, 20 Aug 2024 09:36:10 GMT
   Expires: -1
   Pragma: no-cache
   Strict-Transport-Security: REDACTED
   X-Cache: REDACTED
   X-Content-Type-Options: REDACTED
   X-Ms-Correlation-Request-Id: REDACTED
   X-Ms-Ratelimit-Remaining-Tenant-Reads: REDACTED
   X-Ms-Ratelimit-Remaining-Tenant-Resource-Requests: REDACTED
   X-Ms-Request-Id: 116da4f0-70aa-4944-8074-8129e12ce11d
   X-Ms-Resource-Graph-Request-Duration: REDACTED
   X-Ms-Routing-Request-Id: REDACTED
   X-Ms-User-Quota-Remaining: REDACTED
   X-Ms-User-Quota-Resets-After: REDACTED
   X-Msedge-Ref: REDACTED

Regards, Red.

cmendible commented 1 month ago

Hmm sorry about that. If you use the --debug flag (i.e. azqr scan --debug) you should see messages like:

DBG // Azure Resource Graph Query
// This resource graph query will return all storage accounts that does not have a Private Endpoint Connection or where a private endpoint exists but public access is enabled
resources
| where type =~ "Microsoft.Storage/StorageAccounts"
| where isnull(properties.privateEndpointConnections) or properties.privateEndpointConnections[0].properties.provisioningState != ("Succeeded") or (isnull(properties.networkAcls) and properties.publicNetworkAccess == 'Enabled')
| extend param1 = strcat('Private Endpoint: ', iif(isnotnull(properties.privateEndpointConnections),split(properties.privateEndpointConnections[0].properties.privateEndpoint.id,'/')[8],'No Private Endpoint'))
| extend param2 = strcat('Access: ', iif(properties.publicNetworkAccess == 'Disabled', 'Public Access Disabled', iif(isnotnull(properties.networkAcls), 'NetworkACLs in place','Public Access Enabled')))
| project recommendationId = "dc55be60-6f8c-461e-a9d5-a3c7686ed94e", name, id, tags, param1, param2

If possible, please share the Azure Resource Graph query that appears right before the panic error.

Thanks for your patience and help!

red-erik commented 1 month ago

Hello, it should be this one:

2024-08-20T12:40:03+02:00 DBG // Azure Resource Graph Query
// Find Subnets with Service Endpoint enabled for services that offer Private Link
resources
| where type =~ 'Microsoft.Network/virtualnetworks'
| mv-expand subnets = properties.subnets
| extend se = array_length(subnets.properties.serviceEndpoints)
| where se >= 1
| project name, id, tags, subnets, serviceEndpoints=todynamic(subnets.properties.serviceEndpoints)
| mv-expand serviceEndpoints
| project name, id, tags, subnetName=subnets.name, serviceName=tostring(serviceEndpoints.service)
| where serviceName in (parse_json('["Microsoft.CognitiveServices","Microsoft.AzureCosmosDB","Microsoft.DBforMariaDB","Microsoft.DBforMySQL","Microsoft.DBforPostgreSQL","Microsoft.EventHub","Microsoft.KeyVault","Microsoft.ServiceBus","Microsoft.Sql", "Microsoft.Storage","Microsoft.StorageSync","Microsoft.Synapse","Microsoft.Web"]'))
| project recommendationId = "24ae3773-cc2c-3649-88de-c9788e25b463", name, id, tags, param1 = strcat("subnet=", subnetName), param2=strcat("serviceName=",serviceName), param3="ServiceEndpoints=true"

However, I'm not sure, because I don't see this query on screen during the session; I only see it when redirecting the output with > debug.txt, and in that file I don't see the panic error.

The command line is: .\azqr2_0.exe scan --mask=false -f --debug > debug.txt

The environment is quite large.

Regards, Red.

cmendible commented 1 month ago

Just released: v.2.0.0-preview.4

Please check if it fixes the issue for you.

red-erik commented 1 month ago

It seems to be running, but very slowly.

I'll keep you updated.

2024-08-20T14:41:07+02:00 FTL Failed to get diagnostic settings error="Post \"https://management.azure.com/batch?api-version=2020-06-01\": dial tcp [2603:1030:a0c::10]:443: bind: An operation on a socket could not be performed because the system lacked sufficient buffer space or because a queue was full."

Running again with debug option

Red.

cmendible commented 1 month ago

It now seems that, because your environment is so large, azqr is attempting more than 13,000 diagnostic settings queries in parallel and failing.

Let me see what I can do to fix this.

In the meantime, please use the -s flag to set a subscription ID and, if needed, the -g flag to specify a resource group, in order to reduce the number of services scanned.

cmendible commented 1 month ago

@red-erik closing this one since the panic error was fixed.

Diagnostic Settings issue for large environments will be tracked here: https://github.com/Azure/azqr/issues/249