cdot65 / pan-os-upgrade

An efficient tool to execute configuration backups, network state snapshots, system readiness checks, and operating system upgrades of Palo Alto Networks firewalls and Panorama appliances.
https://cdot65.github.io/pan-os-upgrade/
Apache License 2.0
39 stars, 7 forks

Implement environmental status capture and failure reporting #135

Open cdot65 opened 4 months ago

cdot65 commented 4 months ago

Is your feature request related to a problem? Please describe.

When upgrading PAN-OS on firewalls using the pan-os-upgrade utility, it is important to monitor the environmental status of the devices before and after the upgrade process. The environmental status includes information such as temperature, fan speed, power supply health, and other hardware-related metrics. Capturing this information helps in identifying any potential hardware issues or failures that may impact the upgrade process or the device's stability after the upgrade. Currently, the utility does not have a built-in mechanism to capture the environmental status and report on any failures.

Describe the solution you'd like

Enhance the pan-os-upgrade utility to include the ability to capture the environmental status of the devices before and after the upgrade process and report on any failures or anomalies. The utility should:

  1. Use the PAN-OS SDK to execute the equivalent of the show system environmentals command on the firewall to retrieve the environmental status information.
  2. Parse the environmental status information returned by the SDK and extract relevant metrics, such as:
    • Temperature readings for critical components (e.g., CPU, power supplies, fan trays)
    • Fan speeds and operational status
    • Power supply status and health
    • Any other pertinent environmental metrics
  3. Store the captured environmental status information in a structured format (e.g., JSON or XML) along with metadata such as the device model, serial number, and timestamp.
  4. Proceed with the normal upgrade process.
  5. After the upgrade is completed and the firewall is back online, re-capture the environmental status information using the same SDK command.
  6. Compare the pre-upgrade and post-upgrade environmental status information to identify any changes, failures, or anomalies.
  7. Generate a report or display the comparison results to the user, highlighting any issues or potential problems detected.
  8. Implement threshold-based alerting or notifications for critical environmental metrics, such as high temperatures or failed power supplies.
  9. Provide recommendations or suggested actions to address any identified environmental failures or issues.
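The capture, store, and compare steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the utility's implementation: the sample XML is hypothetical and only approximates the reply to show system environmentals (element names such as DegreesC, alarm, and max vary by platform and PAN-OS release), and the function names, serial number, and finding labels are invented for the example. On a live device the XML might be obtained through the SDK, for instance with pan-os-python's `Firewall.op("show system environmentals", xml=True)`.

```python
import json
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Hypothetical sample; real replies differ by platform and PAN-OS release.
SAMPLE_XML = """
<response status="success"><result>
  <thermal><Slot1>
    <entry><slot>1</slot><description>Temperature near CPU</description>
      <alarm>False</alarm><DegreesC>41.5</DegreesC><min>5.0</min><max>60.0</max></entry>
  </Slot1></thermal>
  <fan><Slot1>
    <entry><slot>1</slot><description>Fan #1 RPM</description>
      <alarm>False</alarm><RPMs>7800</RPMs><min>2500</min></entry>
  </Slot1></fan>
  <power-supply><Slot1>
    <entry><slot>1</slot><description>Power Supply #1</description>
      <alarm>False</alarm><Inserted>True</Inserted></entry>
  </Slot1></power-supply>
</result></response>
"""


def parse_environmentals(xml_text):
    """Flatten every <entry> into {"category/slotN/description": {field: value}}."""
    root = ET.fromstring(xml_text)
    result = root.find("result")
    if result is None:  # caller passed the <result> element's XML directly
        result = root
    readings = {}
    for category in result:
        for entry in category.iter("entry"):
            fields = {child.tag: (child.text or "").strip() for child in entry}
            key = "{}/slot{}/{}".format(
                category.tag,
                fields.get("slot", "?"),
                fields.get("description", "unknown"),
            )
            readings[key] = fields
    return readings


def build_snapshot(xml_text, model, serial):
    """Wrap parsed readings with device metadata; the result is JSON-serializable."""
    return {
        "model": model,
        "serial": serial,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "readings": parse_environmentals(xml_text),
    }


def over_temperature(fields):
    """True when a reading exceeds its own reported max (assumed field names)."""
    try:
        return float(fields["DegreesC"]) > float(fields["max"])
    except (KeyError, ValueError):
        return False


def compare_snapshots(pre, post):
    """List post-upgrade alarms, over-temperature readings, and vanished sensors."""
    findings = []
    for key, after in post["readings"].items():
        if after.get("alarm", "").lower() == "true":
            findings.append(f"ALARM: {key}")
        if over_temperature(after):
            findings.append(f"OVER_TEMP: {key}")
    for key in pre["readings"]:
        if key not in post["readings"]:
            findings.append(f"MISSING: {key}")
    return findings


# Usage: simulate a fan alarm appearing after the upgrade.
pre_snapshot = build_snapshot(SAMPLE_XML, "PA-440", "000000000000")
post_xml = SAMPLE_XML.replace("<alarm>False</alarm><RPMs>", "<alarm>True</alarm><RPMs>")
post_snapshot = build_snapshot(post_xml, "PA-440", "000000000000")
print(json.dumps(compare_snapshots(pre_snapshot, post_snapshot), indent=2))
```

Keying each reading by category, slot, and description keeps the comparison independent of element order, and a sensor that disappears after the upgrade shows up as a finding rather than being silently ignored.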

Describe alternatives you've considered

An alternative approach could be to rely on external monitoring systems or SNMP traps to capture and monitor the environmental status of the devices. However, this would require additional setup and integration efforts and may not provide a seamless experience within the pan-os-upgrade utility itself.

Additional context

Here are a few additional points to consider:

By implementing this feature, the pan-os-upgrade utility will provide a comprehensive solution for capturing and monitoring the environmental status of the devices before and after the upgrade process. This will help in identifying any potential hardware failures or issues that may impact the upgrade success or the device's stability, enabling proactive troubleshooting and remediation. It enhances the overall reliability and resilience of the upgrade workflow.