We recently stopped pushing experiment logs to GCP:
https://github.com/m-lab/k8s-support/pull/849
Vector should still be pushing logs from some containers, such as node-exporter, cAdvisor, and flannel. However, it appears that these containers produce too few logs to keep this alert from firing.
Vector is also still pushing kernel logs above a certain level. Again, it seems that a good chunk of machines do not push kernel logs frequently enough to make a reliable alert.
All this said, we have observed that Vector sometimes gets into an error state with regard to GCP, and until now that state has only been visible through the alert this commit removes. We need to figure out a way to alert when Vector cannot push logs to Stackdriver but neither exits nor produces anything other than a log message. Maybe Vector can push its own logs to Stackdriver, and we can match its error messages with a custom log-based metric?
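As a starting point for that idea, here is a rough sketch of what the log-based metric definition might look like, written as the JSON body of a Cloud Logging LogMetric resource. Everything specific in it is an assumption, not something this change sets up: the metric name `vector_sink_errors` is hypothetical, the container is assumed to be named `vector`, and Vector's own logs are assumed to reach Stackdriver as `k8s_container` entries with a usable severity.

```python
import json


def vector_error_metric(container_name: str = "vector",
                        min_severity: str = "ERROR") -> dict:
    """Build a LogMetric body that counts error-level entries from Vector.

    Assumptions (not confirmed by this PR): Vector's own logs are exported
    to Stackdriver as k8s_container entries, the container is named
    "vector", and the entries carry a meaningful severity.
    """
    log_filter = " AND ".join([
        'resource.type="k8s_container"',
        f'resource.labels.container_name="{container_name}"',
        f"severity>={min_severity}",
    ])
    return {
        # Hypothetical metric name, chosen only for illustration.
        "name": "vector_sink_errors",
        "description": "Count of error-level log entries emitted by Vector.",
        "filter": log_filter,
        # A counter metric: one increment per matching log entry.
        "metricDescriptor": {"metricKind": "DELTA", "valueType": "INT64"},
    }


if __name__ == "__main__":
    print(json.dumps(vector_error_metric(), indent=2))
```

An alert could then fire when this counter stays above zero for some window. If Vector cannot reach Stackdriver at all, its own error logs would not arrive either, so this would only catch partial failures.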