If ansible-pull does not complete successfully, this is currently only visible in the logs (and it usually means something is broken). It would be very useful to visualize this somehow.
Using a notification role like ansible-role-flowdock is not ideal: if the play doesn't even start, or it fails before reaching the notification role, no message is sent.
It would be good if the solution didn't depend on Internet access. Having the pull script do a curl -XPUT to grafana/influxdb directly wouldn't be ideal, as broken NAT or routing is quite easy to end up with. Something with a buffer, or at least a proxy via, for example, the admin node, would be nice.
It would also be nice if the result could be stored in InfluxDB and visualized in Grafana, so there is no need to maintain an ELK.
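A rough sketch of the buffering idea, in case it helps the discussion: the pull script appends each result to a local spool file in InfluxDB line protocol and then tries to flush the spool through a proxy. Everything named here is an assumption for illustration: the spool path, the `ansible_pull` measurement, the `admin-node` hostname, and the InfluxDB 1.x-style `/write` endpoint reached with POST.

```shell
#!/bin/sh
# Hypothetical defaults; none of these names come from the real setup.
SPOOL="${SPOOL:-/tmp/ansible-pull.spool}"
INFLUX_URL="${INFLUX_URL:-http://admin-node:8086/write?db=ansible&precision=s}"

# Append one point in InfluxDB line protocol to the local spool.
# The host is a tag, the status is a string field, the timestamp is in seconds.
spool_point() {
    printf 'ansible_pull,host=%s status="%s" %s\n' \
        "$(hostname)" "$1" "$(date +%s)" >> "$SPOOL"
}

# Try to send everything buffered so far.
# The spool is only emptied when curl reports success, so points
# survive a broken NAT or route and are retried on the next run.
flush_spool() {
    [ -s "$SPOOL" ] || return 0
    if curl -fsS -m 10 -XPOST "$INFLUX_URL" \
            --data-binary @"$SPOOL" >/dev/null 2>&1; then
        : > "$SPOOL"
    else
        return 1
    fi
}
```

The pull script would call `spool_point failed` (or `successful`) at the end of a run, followed by `flush_spool`. A point appended between the upload and the truncation could be lost, so a real version would probably rotate the spool file before sending.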
Ideas are welcome.
Configure all sites to send ansible logs to the ulysses ELK and visualize them in a dashboard.
Have ansible-pull-script.sh write its status (started, running, failed?, successful) to a file. Then have NHC check the contents (and age?) of that file.
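A minimal sketch of the status-file idea, assuming a hypothetical file location and a "state timestamp" format on a single line. The function names `write_status` and `check_status` are made up for illustration and are not part of ansible-pull-script.sh or NHC.

```shell
#!/bin/sh
# Hypothetical path; not taken from the real ansible-pull-script.sh.
STATUS_FILE="${STATUS_FILE:-/tmp/ansible-pull.status}"

# Record "<state> <unix timestamp>". Writing to a temp file and renaming
# keeps a reader from ever seeing a half-written status.
write_status() {
    printf '%s %s\n' "$1" "$(date +%s)" > "${STATUS_FILE}.tmp" &&
        mv "${STATUS_FILE}.tmp" "$STATUS_FILE"
}

# Succeed only if the last recorded state is "successful" and the
# record is at most max_age seconds old (default: 2 hours).
check_status() {
    max_age="${1:-7200}"
    [ -r "$STATUS_FILE" ] || { echo "no status file"; return 1; }
    read -r state stamp < "$STATUS_FILE"
    age=$(( $(date +%s) - stamp ))
    if [ "$state" != "successful" ]; then
        echo "ansible-pull state is '$state'"
        return 1
    fi
    if [ "$age" -gt "$max_age" ]; then
        echo "ansible-pull status is ${age}s old"
        return 1
    fi
    return 0
}
```

The pull script would call `write_status started` before running ansible-pull and `write_status successful` or `write_status failed` afterwards, depending on the exit code. A custom NHC check could then wrap something like `check_status`. The age check also catches the case where the play never starts at all, since the file simply goes stale.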