sscargal / pmemchk

[analyzer] Don't execute rules if no PMem modules are found #102

Open sscargal opened 2 years ago

sscargal commented 2 years ago

Here's the output collected from a VM running openSUSE Leap 15.3 with no PMem installed. Some rules pass, others fail, and many analyzer scripts print 'bad array subscript' errors. pmemchk shouldn't execute any rules if no PMem modules are found in the environment.

> sudo ./pmemchk 
=======================================================================
Starting PMem Checker
pmemchk Version 0.1.0
Started: Fri Apr 15 16:40:27 UTC 2022
=======================================================================
Using NDCTL command: /usr/bin/ndctl
NDCTL version: 71.1
Using IPMCTL command: /usr/bin/ipmctl
IPMCTL version: run
INFO: cxl command not found! Use -c to specify the location.
Operating System: openSUSE Leap 15.3
Kernel Version : 5.3.18-150300.59.43-default
CPU(s):                          1
Socket(s):                       1
NUMA node(s):                    1
Model name:                      Intel(R) Xeon(R) CPU @ 2.30GHz
NUMA node0 CPU(s):               0
=======================================================================
Starting data collector
=======================================================================
Running IPMCTL Collector
100% (32 of 32)  [====================] 
Running NDCTL Collector
100% (8 of 8)  [====================] 
Running CXL Collector
100% (3 of 3)  [====================] 
Collecting files
100% (1 of 1)  [====================] 
Collecting command outputs
100% (2 of 2)  [====================] 
Post-Processing Collected Data
100% (5 of 5)  [====================] 
=======================================================================
Data collector completed
=======================================================================
=======================================================================
Starting analysis of the data
=======================================================================
=======================================================================
./analyzer/optane/check_FWUpdateStatus: line 39: FW_Version[${PropertyValue}]: bad array subscript
[ PASSED   ] optane_check_dimm_extended_adr_enabled : All PMem modules have ExtendedAdrEnabled switch disabled
./analyzer/optane/check_dimm_MediaTemperatureInjectionEnabled: line 39: MEDIATEMP_INJECTION[${PropertyValue}]: bad array subscript
./analyzer/optane/check_dimm_MediaTemperature_Injection_Counter: line 39: MEDIA_TEMP_INJ_COUNTER[${PropertyValue}]: bad array subscript
[ INFO     ] optane_check_dimm_ppc_extended_adr_enabled : One or more PMem modules have PpcExtendedAdrEnabled=False (Disabled)
[ PASSED   ] optane_check_dimm_software_triggers_enabled_details : All PMem modules have software trigger disabled
[ FAILED   ] optane_check_dimm_ait_dram_enabled : One or more PMem modules have AitDramEnabled=False (Disabled)
./analyzer/optane/check_dimm_arsstatus: line 39: DIMM_HEALTH[${PropertyValue}]: bad array subscript
[ PASSED   ] optane_check_dimm_boot_status : All PMem modules have successful BootStatus 
./analyzer/optane/check_dimm_capacity: line 35: DIMM_CAPACITY_ARR[${DimmID}]: bad array subscript
./analyzer/optane/check_dimm_configurationstatus: line 39: DIMM_HEALTH[${PropertyValue}]: bad array subscript
[ PASSED   ] optane_check_dimm_error_injection_enabled : All PMem modules have error injection disabled
./analyzer/optane/check_dimm_firmware_version: line 35: DIMM_FW[${DimmID}]: bad array subscript
./analyzer/optane/check_dimm_health_status: line 37: DIMM_HEALTHSTATE_COUNT[${HealthState}]: bad array subscript
./analyzer/optane/check_dimm_lockstate: line 60: DIMM_LOCKSTATE_COUNT[${LockState}]: bad array subscript
[ PASSED   ] optane_check_dimm_memoryboostfeature : Memory Bandwidth Boot Feature is Disabled
[ INFO     ] optane_check_dimm_overwrite_status : All PMem modules have a definite OverwriteStatus
./analyzer/optane/check_dimm_packagesparesavailable: line 39: DIMM_HEALTH[${PropertyValue}]: bad array subscript
./analyzer/optane/check_dimm_packagesparingcapable: line 39: DIMM_HEALTH[${PropertyValue}]: bad array subscript
./analyzer/optane/check_dimm_packagesparingenabled: line 39: DIMM_HEALTH[${PropertyValue}]: bad array subscript
./analyzer/optane/check_dimm_partnumber: line 35: DIMM_HEALTH[${PropertyValue}]: bad array subscript
./analyzer/optane/check_dimm_percentage_remaining: line 26: -1: substring expression < 0
[ FAILED   ] optane_check_dimm_show_goal : A pending goal configuration exists and will be applied on the next system reboot 
./analyzer/optane/check_dimm_skuviolation: line 39: DIMM_HEALTH[${PropertyValue}]: bad array subscript
[ PASSED   ] optane_check_dimm_software_trigger_counter : All PMem modules have Software Trigger Counter == 0
./analyzer/optane/check_dimm_thermalthrottlelosspercent: line 48: DIMM_HEALTH[${PropertyValue}]: bad array subscript
[ PASSED   ] optane_check_dimm_viral_policy : All PMem modules have Viral Policy disabled
[ PASSED   ] optane_check_dimm_viral_state : All PMem modules are not Viral
./analyzer/optane/check_masterpassphraseenabled: line 41: DIMM_HEALTH[${PropertyValue}]: bad array subscript
./analyzer/optane/check_region_capacity: line 34: REGION_CAPACITY[${Capacity}]: bad array subscript
./analyzer/optane/check_region_freecapacity: line 34: REGION_FREECAPACITY[${FreeCapacity}]: bad array subscript
./analyzer/optane/check_region_health: line 38: REGION_HEALTH[${HealthState}]: bad array subscript
./analyzer/optane/check_region_persistentmemorytype: line 34: REGION_TYPE[${PersistentMemoryType}]: bad array subscript
=======================================================================
Data analysis completed
=======================================================================
=======================================================================
Analysis Report Summary
=======================================================================
[ PASSED   ] = 10
[ FAILED   ] = 3
[ INFO     ] = 1
[ WARNING  ] = 0
=======================================================================
PMem Checker Complete
Ended: Fri Apr 15 16:40:28 UTC 2022
Duration: 1 seconds
Results: ./pmemchk.instance-1.0415-1640
=======================================================================

sscargal commented 2 years ago

We need to skip the 'optane' analyzer module when no PMem modules are detected. See #67.
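
One possible shape for such a guard is sketched below. This is not taken from the pmemchk source: the function names `count_pmem_modules`, `should_run_optane_rules`, and `run_optane_rules` are hypothetical, and the sketch assumes that each module shows up as a "dev":"nmemN" entry in the JSON printed by `ndctl list --dimms`. The counting function reads from stdin, so it could equally be fed from a file already saved by the data collector.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of pmemchk.

# Count PMem modules in `ndctl list --dimms` JSON read from stdin.
# Assumption: every module is listed with a "dev":"nmemN" entry.
count_pmem_modules() {
    grep -Eo '"dev"[[:space:]]*:[[:space:]]*"nmem[0-9]+"' | wc -l | tr -d '[:space:]'
}

# Succeed (exit status 0) only when the count is a positive integer.
should_run_optane_rules() {
    local count="$1"
    [[ "${count}" =~ ^[0-9]+$ ]] && (( count > 0 ))
}

# Usage sketch (assumes ndctl is on PATH):
#   modules=$(ndctl list --dimms 2>/dev/null | count_pmem_modules)
#   if should_run_optane_rules "${modules}"; then
#       run_optane_rules    # hypothetical entry point for ./analyzer/optane/*
#   else
#       echo "INFO: No PMem modules found. Skipping the optane rules."
#   fi
```

Doing the check once, before the analyzer loop starts, would avoid both the 'bad array subscript' errors and the misleading PASSED/FAILED results shown above, without having to touch each rule script individually.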