puppetlabs / puppetlabs-pe_status_check

Self Service Module for Puppet Enterprise
Apache License 2.0

(#236) agent_state_summary: Count nodes without report as unhealthy #238

Closed bastelfreak closed 6 days ago

bastelfreak commented 1 month ago

It's possible that a Puppet Agent was stopped or disabled and all old reports were garbage collected from PuppetDB. The node still exists in PuppetDB, but when checking for a report the timestamp is null:

puppet query 'nodes[certname,report_timestamp]{}'
[
  {
    "certname": "pe.tim.local",
    "report_timestamp": "2024-09-30T13:21:17.042Z"
  },
  {
    "certname": "pe2.tim.local",
    "report_timestamp": null
  }
]

Previously, we always assumed that report_timestamp holds a valid timestamp. This patch explicitly validates the timestamp and counts nodes without one as unhealthy.
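To illustrate the check described above, here is a minimal Python sketch, not the plan's actual Puppet code. It splits the nodes from the query output into those with a parseable report_timestamp and those without. The helper names `has_valid_report` and `split_by_report` are hypothetical and exist only for this example.

```python
import json
from datetime import datetime

# Sample data copied from the query output above.
NODES_JSON = """
[
  {"certname": "pe.tim.local", "report_timestamp": "2024-09-30T13:21:17.042Z"},
  {"certname": "pe2.tim.local", "report_timestamp": null}
]
"""


def has_valid_report(node):
    """Return True only if report_timestamp is a parseable ISO 8601 string."""
    ts = node.get("report_timestamp")
    if not isinstance(ts, str):
        # JSON null becomes None, which is the case this PR is about.
        return False
    try:
        # Swap the trailing "Z" for an explicit offset so that older
        # Python versions can parse the value as well.
        datetime.fromisoformat(ts.replace("Z", "+00:00"))
    except ValueError:
        return False
    return True


def split_by_report(nodes):
    """Return (certnames with a valid report, certnames without one)."""
    with_report = [n["certname"] for n in nodes if has_valid_report(n)]
    without_report = [n["certname"] for n in nodes if not has_valid_report(n)]
    return with_report, without_report


nodes = json.loads(NODES_JSON)
with_report, no_report = split_by_report(nodes)
print("with report:", with_report)
print("no report (counted as unhealthy):", no_report)
```

Checking the type before parsing keeps a null timestamp from ever reaching the date parser, which is the situation the old code did not account for.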

Now with the fix:

puppet plan run pe_status_check::agent_state_summary --environment peadm log_healthy_nodes=true log_unhealthy_nodes=true
{
    "responsive": [
        "pe.tim.local",
        "pe2.tim.local"
    ],
    "healthy_counter": 0,
    "total_counter": 2,
    "unhealthy_counter": 2,
    "noop": [],
    "unhealthy": [
        "pe2.tim.local",
        "pe.tim.local"
    ],
    "healthy": [],
    "changed": [
        "pe.tim.local"
    ],
    "no_report": [
        "pe.tim.local"
    ],
    "corrective_changes": [],
    "used_cached_catalog": [
        "pe2.tim.local"
    ],
    "unresponsive": [],
    "failed": []
}
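As a quick sanity check on output like the above, the following Python sketch verifies that the counters agree with the lists. It assumes, based only on this sample, that each counter equals the length of its list and that healthy plus unhealthy equals the total; the function name `check_counters` is hypothetical.

```python
import json

# Plan output copied from above, trimmed to the fields used here.
RESULT_JSON = """
{
  "healthy_counter": 0,
  "total_counter": 2,
  "unhealthy_counter": 2,
  "healthy": [],
  "unhealthy": ["pe2.tim.local", "pe.tim.local"]
}
"""


def check_counters(result):
    """Return a list of human-readable inconsistencies (empty if none)."""
    problems = []
    if result["healthy_counter"] != len(result["healthy"]):
        problems.append("healthy_counter does not match the healthy list")
    if result["unhealthy_counter"] != len(result["unhealthy"]):
        problems.append("unhealthy_counter does not match the unhealthy list")
    # Assumption: every node is either healthy or unhealthy.
    total = result["healthy_counter"] + result["unhealthy_counter"]
    if total != result["total_counter"]:
        problems.append("healthy + unhealthy does not equal total_counter")
    return problems


result = json.loads(RESULT_JSON)
problems = check_counters(result)
print(problems or "counters are consistent")
```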


bastelfreak commented 1 month ago

pe2.tim.local is listed here as used_cached_catalog. That's another bug, fixed in https://github.com/puppetlabs/puppetlabs-pe_status_check/pull/237

taikaa commented 3 weeks ago

@bastelfreak apologies for the delay in reviewing the PR. I tested the PR and no longer get the error. Thanks for adding this.

taikaa commented 3 weeks ago

@MartyEwings hello, is it alright to merge this PR despite the failed tests? Thank you!

bastelfreak commented 1 week ago

Because nobody is reviewing this, I raised support ticket #01302632.

bastelfreak commented 6 days ago

@MartyEwings thanks for merging! Would it be possible to get a new release?