Open Shivani3351 opened 7 months ago
Sensu checks are not executing occasionally on scheduled time
Expected Behavior
Checks should never stop being scheduled unless they are no longer published.
Current Behavior
Some Sensu checks occasionally do not execute at their scheduled time.
Sensu logs:
influxdb handler asset configured
{"component":"schedulerd","cron":"0 /2 ","error":"error while starting ring watcher: context canceled","level":"error","msg":"error scheduling check","name":"tech-elkes-data-backup-validation","namespace":"default","scheduler_type":"round-robin cron","time":"2024-02-07T16:27:21Z"}
{"component":"schedulerd","cron":"/35 ","error":"error while starting ring watcher: context canceled","level":"error","msg":"error scheduling check","name":"tech-netty-thread-status-alert","namespace":"default","scheduler_type":"round-robin cron","time":"2024-02-07T16:27:21Z"}
{"component":"schedulerd","cron":"/4 ","error":"error while starting ring watcher: context canceled","level":"error","msg":"error scheduling check","name":"tech-nginx-basic-status","namespace":"default","scheduler_type":"round-robin cron","time":"2024-02-07T16:27:22Z"}
{"component":"schedulerd","cron":"/4 ","error":"error while starting ring watcher: context canceled","level":"error","msg":"error scheduling check","name":"tech-nginx-connection-alert","namespace":"default","scheduler_type":"round-robin cron","time":"2024-02-07T16:27:22Z"}
{"component":"schedulerd","cron":"/4 ","error":"error while starting ring watcher: context canceled","level":"error","msg":"error scheduling check","name":"tech-nginx-mdm-ui-server-status","namespace":"default","scheduler_type":"round-robin cron","time":"2024-02-07T16:27:22Z"}
{"component":"schedulerd","cron":"/4 ","error":"error while starting ring watcher: context canceled","level":"error","msg":"error scheduling check","name":"tech-nginx-rdp-api-server-status","namespace":"default","scheduler_type":"round-robin cron","time":"2024-02-07T16:27:22Z"}
{"component":"schedulerd","cron":"/4 ","error":"error while starting ring watcher: context canceled","level":"error","msg":"error scheduling check","name":"tech-nginx-status-alert","namespace":"default","scheduler_type":"round-robin cron","time":"2024-02-07T16:27:22Z"}
{"component":"schedulerd","cron":"0 /12 ","error":"error while starting ring watcher: context canceled","level":"error","msg":"error scheduling check","name":"tenant-rdp-validation-alert","namespace":"default","scheduler_type":"round-robin cron","time":"2024-02-07T16:27:23Z"}
{"component":"schedulerd","cron":"0 /12 ","error":"error while starting ring watcher: context canceled","level":"error","msg":"error scheduling check","name":"tenant-system-user-validation-alert","namespace":"default","scheduler_type":"round-robin cron","time":"2024-02-07T16:27:23Z"}
{"component":"schedulerd","cron":"0 0 *","error":"error while starting ring watcher: context canceled","level":"error","msg":"error scheduling check","name":"topology-apm-stats-alert","namespace":"default","scheduler_type":"round-robin cron","time":"2024-02-07T16:27:23Z"}
I had a similar issue and solved it by building a cluster. We have around 19,000 checks spread across 600 hosts. I set up a cluster of 3 backend nodes and a separate 3-node etcd cluster. No issues since.