jetstack / version-checker

Kubernetes utility for exposing image versions in use, compared to latest available upstream, as metrics.
https://jetstack.io
Apache License 2.0

[BUG] Missing container metrics with multiple containers in a pod #274

Open oGi4i opened 2 months ago

oGi4i commented 2 months ago

Describe the bug If a pod has more than one container, only one metric is reported for the whole pod.

To Reproduce Steps to reproduce the behavior:

  1. Deploy a pod with multiple containers, with version checking enabled for all of them
  2. Query Prometheus with the following query:
    version_checker_is_latest_version{exported_namespace="<your_namespace>", exported_pod="<your_pod_name>"}
  3. The actual metrics are:
    version_checker_is_latest_version{container="version-checker", container_type="container", current_version="<image_2_version>", endpoint="web", exported_container="<container_name_2>", exported_namespace="<your_namespace>", exported_pod="<your_pod_name>", image="<image_2>", instance="10.255.254.150:8080", job="version-checker", latest_version="<image_2_latest_version>", namespace="version-checker", pod="version-checker-6dd5b85f7d-4s86k", service="version-checker"} = 1
  4. The expected metrics should be:
    version_checker_is_latest_version{container="version-checker", container_type="container", current_version="<image_1_version>", endpoint="web", exported_container="<container_name_1>", exported_namespace="<your_namespace>", exported_pod="<your_pod_name>", image="<image_1>", instance="10.255.254.150:8080", job="version-checker", latest_version="<image_1_latest_version>", namespace="version-checker", pod="version-checker-6dd5b85f7d-4s86k", service="version-checker"} = 1
    version_checker_is_latest_version{container="version-checker", container_type="container", current_version="<image_2_version>", endpoint="web", exported_container="<container_name_2>", exported_namespace="<your_namespace>", exported_pod="<your_pod_name>", image="<image_2>", instance="10.255.254.150:8080", job="version-checker", latest_version="<image_2_latest_version>", namespace="version-checker", pod="version-checker-6dd5b85f7d-4s86k", service="version-checker"} = 1

Expected behavior One metric is reported per container.

Environment (please complete the following information):

Additional context The root cause of the bug is that metrics are deleted using partial labels that contain only the namespace and pod. This deletes every metric for the pod, when the intent is to delete only the metric for the container currently being checked. Adding a container label to the partial label set should fix the issue.
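
To illustrate the suspected behavior, here is a minimal, standard-library-only sketch of partial-label deletion. It is a toy model and not version-checker's actual code: the `store` type, the `deletePartialMatch` and `newPodStore` helpers, and the label names `namespace`, `pod`, and `container` are all assumptions made for the example.

```go
package main

// labels is a simplified stand-in for a metric's label set.
type labels map[string]string

// store is a toy model of a metric vector: one series per full label set.
// (Hypothetical type, used only for this illustration.)
type store struct {
	series []labels
}

// matches reports whether every key/value pair in partial is present in full.
func matches(full, partial labels) bool {
	for k, want := range partial {
		if got, ok := full[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// deletePartialMatch removes every series whose labels contain all of the
// pairs in partial, and returns the number of series removed.
func (s *store) deletePartialMatch(partial labels) int {
	kept := make([]labels, 0, len(s.series))
	removed := 0
	for _, l := range s.series {
		if matches(l, partial) {
			removed++
			continue
		}
		kept = append(kept, l)
	}
	s.series = kept
	return removed
}

// newPodStore returns a store holding one series for each of two containers
// in the same pod (the namespace, pod, and container names are made up).
func newPodStore() *store {
	return &store{series: []labels{
		{"namespace": "demo", "pod": "web-0", "container": "app"},
		{"namespace": "demo", "pod": "web-0", "container": "sidecar"},
	}}
}
```

In this model, `newPodStore().deletePartialMatch(labels{"namespace": "demo", "pod": "web-0"})` removes both series, which would match the symptom described above if the metric for the first container is wiped while the second is being processed. Adding `"container": "app"` to the partial set removes only that container's series and leaves the `sidecar` series in place.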