microsoft / ARI

Azure Resource Inventory - It's a Powerful tool to create EXCEL inventory from Azure Resources with low effort
MIT License
1.11k stars 396 forks source link

Still hanging with large env #227

Closed red-erik closed 3 weeks ago

red-erik commented 1 month ago

Hello, after 5 hours we are stuck here

Image

Regards, Red.

Claudio-Merola commented 1 month ago

Hi @red-erik

How much memory is the Powershell Processes consuming?

You should have run it with the -Debug, so we would know how many jobs are still running. is possible this is happening due to some specific module, like VMs, or AKS, then we could try to investigate it further.

red-erik commented 4 weeks ago

Hello, it seems to hang on Job 1

Large Size Environment Identified. Jobs will be run in batches to avoid CPU Overload. DEBUG: 2024-10-15_08_16_07 - Starting Processing Jobs in Large Mode. DEBUG: 2024-10-15_08_16_07 - Total Resources Being Processed: 71398 DEBUG: 2024-10-15_08_16_07 - Resources Being Processed in ResourceJob_1: 5000 DEBUG: 2024-10-15_08_16_08 - Resources Being Processed in ResourceJob_2: 5000 DEBUG: 2024-10-15_08_16_09 - Resources Being Processed in ResourceJob_3: 5000 DEBUG: 2024-10-15_08_16_10 - Resources Being Processed in ResourceJob_4: 5000 DEBUG: 2024-10-15_08_16_12 - Resources Being Processed in ResourceJob_5: 5000 DEBUG: 2024-10-15_08_16_12 - Waiting Batch of Jobs to Complete. DEBUG: 2024-10-15_08_16_13 - initial Jobs Running: 5 DEBUG: 2024-10-15_08_16_28 - initial Jobs Running: 5 DEBUG: 2024-10-15_08_16_43 - initial Jobs Running: 5 DEBUG: 2024-10-15_08_16_58 - initial Jobs Running: 5 DEBUG: 2024-10-15_08_17_13 - initial Jobs Running: 5 DEBUG: 2024-10-15_08_17_28 - initial Jobs Running: 5 DEBUG: 2024-10-15_08_17_43 - initial Jobs Running: 5 DEBUG: 2024-10-15_08_17_58 - initial Jobs Running: 5 DEBUG: 2024-10-15_08_18_14 - initial Jobs Running: 5 DEBUG: 2024-10-15_08_18_29 - initial Jobs Running: 5 DEBUG: 2024-10-15_08_18_44 - initial Jobs Running: 5 DEBUG: 2024-10-15_08_18_59 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_19_14 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_19_29 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_19_44 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_19_59 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_20_14 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_20_29 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_20_45 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_21_00 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_21_15 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_21_30 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_21_45 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_22_00 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_22_15 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_22_30 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_22_45 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_23_00 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_23_15 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_23_30 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_23_45 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_24_00 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_24_15 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_24_30 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_24_45 - initial Jobs Running: 1 DEBUG: 2024-10-15_08_25_00 - initial Jobs Running: 1 Azure Inventory [21% Complete. ] Processing Initial Resource Jobs [80% Complete. ]

Image

Regards, Red.

red-erik commented 4 weeks ago

Hello, still running, same status

DEBUG: 2024-10-15_12_44_09 - initial Jobs Running: 1 DEBUG: 2024-10-15_12_44_24 - initial Jobs Running: 1 DEBUG: 2024-10-15_12_44_39 - initial Jobs Running: 1 DEBUG: 2024-10-15_12_44_54 - initial Jobs Running: 1 Azure Inventory [21% Complete. ] Processing Initial Resource Jobs [80% Complete. ]

Regards, Red.

Claudio-Merola commented 4 weeks ago

Ok, I think we can assume it is getting stuck at some resource module.

What I would like to do now was to start removing the .ps1 files from the module folder to see what module file is dragging back the script, like if we remove the VM.ps1 and the script run way faster, it means this is the module causing the issue and we need to investigate what in the several loops there is causing the slow.

red-erik commented 4 weeks ago

Hello, renamed, just as a test, vm.ps1 and now it hangs here

DEBUG: 2024-10-16_11_32_08 - initial Jobs Running: 2 DEBUG: 2024-10-16_11_32_24 - initial Jobs Running: 2 DEBUG: 2024-10-16_11_32_39 - initial Jobs Running: 2 DEBUG: 2024-10-16_11_32_55 - initial Jobs Running: 2 DEBUG: 2024-10-16_11_33_10 - initial Jobs Running: 2 Azure Inventory [21% Complete. ] Processing Initial Resource Jobs [83% Complete. ]

Would not be possible to have an "enhanced debug" displaying what Job 2 is doing and which script is executing ? Otherwise we need to rename the scripts one by one, hoping to find the offending one.

Regards, Red.

Claudio-Merola commented 4 weeks ago

Actually no, Jobs exists within their on process and have little interaction with other processes.

The only way would be to actually create a log file for each process, and I believe it would be even more confuse to troubleshoot then renaming the files...

By the way, you don't need to do it one by one, you can remove entire folders of the .ps1 files to narrow down which module is causing the issue in the environment.

red-erik commented 3 weeks ago

Hello, I removed Network folder

DEBUG: 08:56:42 - [ConfigManager] Got [True] from [EnableDataCollection], Module = [], Cmdlet = []. DEBUG: 08:56:42 - GetAzureRMContextCommand end processing. DEBUG: 2024-10-17_08_56_45 - Creating Overall Panel. DEBUG: 2024-10-17_08_56_54 - Creating Charts. WARNING: Failed adding PivotTable 'P0': Cannot index into a null array. WARNING: Failed adding chart for pivotable 'P0': Cannot bind argument to parameter 'PivotTable' because it is null. WARNING: Failed adding PivotTable 'P2': Cannot index into a null array. WARNING: Failed adding chart for pivotable 'P2': Cannot bind argument to parameter 'PivotTable' because it is null. DEBUG: 2024-10-17_08_56_58 - Validating if Excel is installed (Extra Customizations). InvalidOperation: The property 'ChartStyle' cannot be found on this object. Verify that the property exists and can be set. InvalidOperation: The property 'ChartStyle' cannot be found on this object. Verify that the property exists and can be set. InvalidOperation: The property 'ChartStyle' cannot be found on this object. Verify that the property exists and can be set. DEBUG: 2024-10-17_08_57_17 - Finished Charts Phase. Report Complete. Total Runtime was: 21,79 Minutes Total Resources: 71341 Total Advisories: 0 Total Policies: 14

But it merged cached info from yesterday running

DEBUG: 2024-10-17_08_48_02 - Checking if Should Generate Policy Sheet. Get-Job: The command cannot find the job because the job name Policy was not found. Verify the value of the Name parameter, and then try the command again. DEBUG: 2024-10-17_08_48_02 - Checking if Should Generate Advisory Sheet. Get-Job: The command cannot find the job because the job name Advisory was not found. Verify the value of the Name parameter, and then try the command again. DEBUG: 2024-10-17_08_48_02 - Generating Subscription sheet. DEBUG: 2024-10-17_08_48_14 - Excel Jobs Compleated. DEBUG: 2024-10-17_08_48_14 - Starting Ordering of Cached Files. DEBUG: 2024-10-17_08_48_14 - Ordering Cached Files. DEBUG: 2024-10-17_08_48_14 - Merging Cached File for: ComputerVision DEBUG: 2024-10-17_08_48_15 - Merging Cached File for: DocIntelligence DEBUG: 2024-10-17_08_48_15 - Merging Cached File for: EvtHub DEBUG: 2024-10-17_08_48_15 - Merging Cached File for: MachineLearning DEBUG: 2024-10-17_08_48_16 - Merging Cached File for: SearchServices DEBUG: 2024-10-17_08_48_16 - Merging Cached File for: Synapse DEBUG: 2024-10-17_08_48_16 - Merging Cached File for: Translator DEBUG: 2024-10-17_08_48_17 - Merging Cached File for: WrkSpace DEBUG: 2024-10-17_08_48_19 - Merging Cached File for: AdvisorScore DEBUG: 2024-10-17_08_48_20 - Merging Cached File for: ManagedIds DEBUG: 2024-10-17_08_48_21 - Merging Cached File for: SupportTickets DEBUG: 2024-10-17_08_48_22 - Merging Cached File for: APPSERVICEPLAN DEBUG: 2024-10-17_08_48_23 - Merging Cached File for: APPSERVICES DEBUG: 2024-10-17_08_48_25 - Merging Cached File for: VM DEBUG: 2024-10-17_08_48_38 - Merging Cached File for: VMSS DEBUG: 2024-10-17_08_48_38 - Merging Cached File for: CosmosDB DEBUG: 2024-10-17_08_48_39 - Merging Cached File for: MySQLflexible DEBUG: 2024-10-17_08_48_39 - Merging Cached File for: RedisCache DEBUG: 2024-10-17_08_48_39 - Merging Cached File for: SQLDB DEBUG: 2024-10-17_08_48_40 - Merging Cached File for: SQLMI DEBUG: 2024-10-17_08_48_40 - Merging Cached File for: SQLMIDB DEBUG: 2024-10-17_08_48_40 - Merging Cached File for: SQLSERVER DEBUG: 2024-10-17_08_48_41 - Merging Cached File for: SQLVM DEBUG: 2024-10-17_08_48_41 - Merging Cached File for: AppGW DEBUG: 2024-10-17_08_48_42 - Merging Cached File for: AutomationAcc DEBUG: 2024-10-17_08_48_42 - Merging Cached File for: AvSet DEBUG: 2024-10-17_08_48_43 - Merging Cached File for: BASTION DEBUG: 2024-10-17_08_48_43 - Merging Cached File for: RecoveryVault DEBUG: 2024-10-17_08_48_44 - Merging Cached File for: Vault DEBUG: 2024-10-17_08_48_46 - Merging Cached File for: APIM DEBUG: 2024-10-17_08_48_46 - Merging Cached File for: AppInsights DEBUG: 2024-10-17_08_48_48 - Merging Cached File for: Backup DEBUG: 2024-10-17_08_48_53 - Merging Cached File for: NetApp DEBUG: 2024-10-17_08_48_54 - Merging Cached File for: StorageAcc DEBUG: 2024-10-17_08_48_59 - Merging Cached File for: VMDisk DEBUG: 2024-10-17_08_49_10 - Starting Resource Reporting. DEBUG: 2024-10-17_08_49_10 - Starting Reporting Phase.

Regards, Red.

Claudio-Merola commented 3 weeks ago

Hi @red-erik

That's interesting, without the network folder it ran in like 21 minutes, right?

Can you put the network folder back in and now just remove the Firewall and Load Balancer file?

I assume these are the most complex and likely to cause issues like that.

Regarding the script merging content from yesterday, this happened because the script was interrupted, so it was not able to delete the cache files.

red-erik commented 3 weeks ago

Hello, the offending module is the LoadBalancer one. Without it we are at 99% so I suppose it will finish soon.

Regards, Red.

red-erik commented 3 weeks ago

Hello, confirmed, the Load Balancer script is the root cause.

Regards, Red.