Open d-mankowski-synerise opened 10 months ago
@d-mankowski-synerise nothing has changed in this area, but I'll have a look anyway to double-check.
@chen-keinan IMO the problem is not related to metrics, but to the creation of vulnerability reports: the digest field is missing in reports created by operator 0.18.1.
It looks like there are still some reports that contain the digest field, which is weird. But there are twice as many metrics without image_digest as metrics that do have the image_digest label.
@d-mankowski-synerise I do not think that digest info is always available to Trivy. This is how digest info is set
This wasn't the case with operator 0.16.4 - as you can see above, before the upgrade we didn't have a single case of trivy_image_vulnerabilities without the image_digest label ({image_digest=""}).
Weird, it looks like the logic is the same in 0.16.4.
It was OK in 0.16.4, and every version above had the same issue, which @d-mankowski-synerise and I decided to nail down now with version 0.18.1. So my guess is that one of the commits for version 0.17.0 broke the digest logic, probably by mistake.
I will roll back to 0.16.4 with the same Trivy version (0.48.2) to make sure it is not related to Trivy itself. With 0.16.4 we used 0.48.0, so I am doubtful it is the cause, but it won't hurt to exclude the possibility.
@d-mankowski-synerise do you have a specific public image that produces image_digest with v0.16.4 and does not produce the same metric with v0.18.1, which I can test with?
@chen-keinan
this query: group by (image_registry, image_repository, image_tag, image_digest) (trivy_image_vulnerabilities{image_registry="ghcr.io"})
returns the following:
{image_registry="ghcr.io", image_repository="external-secrets/external-secrets", image_tag="v0.8.3"}
1
{image_registry="ghcr.io", image_repository="aquasecurity/trivy-operator", image_tag="0.18.1", image_digest="sha256:19633ccb72c369e90d22e38eddd86fbc8f43851cee68c9d7d6acadd5cc053775"}
1
{image_registry="ghcr.io", image_repository="aquasecurity/trivy-operator", image_tag="0.18.1"}
1
which gets even weirder: metrics for the vulnerabilities of image ghcr.io/aquasecurity/trivy-operator:0.18.1 are exposed twice, once with image_digest and once without
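The same check can be run offline against a saved copy of the /metrics output instead of through Prometheus. The following is only a sketch: parse_labels, digests_per_image and mixed_images are hypothetical helper names, and the parser assumes the plain text exposition format seen in this thread (one sample per line, labels in braces).

```python
import re
from collections import defaultdict

# Matches one label pair such as image_tag="0.18.1" inside a sample line.
LABEL_RE = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_labels(line):
    """Return the labels of one sample line as a dict (empty if none)."""
    start, end = line.find("{"), line.rfind("}")
    if start == -1 or end == -1:
        return {}
    return dict(LABEL_RE.findall(line[start + 1:end]))


def digests_per_image(metrics_text):
    """Map (registry, repository, tag) to the set of image_digest values seen."""
    seen = defaultdict(set)
    for line in metrics_text.splitlines():
        if not line.startswith("trivy_image_vulnerabilities{"):
            continue
        labels = parse_labels(line)
        key = (
            labels.get("image_registry", ""),
            labels.get("image_repository", ""),
            labels.get("image_tag", ""),
        )
        # A missing label and an empty label are treated the same way.
        seen[key].add(labels.get("image_digest", ""))
    return dict(seen)


def mixed_images(metrics_text):
    """Images exposed both with an empty digest and with a real one."""
    return sorted(
        key
        for key, digests in digests_per_image(metrics_text).items()
        if "" in digests and len(digests) > 1
    )
```

Feeding it the output of curl -s localhost:8080/metrics would list every image that shows up in both states, which is the same information the group by query gives for a single registry.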
Maybe there is a bug where one metric overrides the other?
IMO the two candidates for introducing such a bug would be the two metrics-related changes in 0.17.0: the addition of OS Info metrics and the addition of clusterCompliance_info metrics.
@d-mankowski-synerise @LesSyner thanks, I'll try to reproduce it and fix it.
@d-mankowski-synerise are you sure it's a duplicate metric? There should be one metric for each severity, for example:
# HELP trivy_image_vulnerabilities Number of container image vulnerabilities
# TYPE trivy_image_vulnerabilities gauge
trivy_image_vulnerabilities{container_name="coredns",image_digest="sha256:ead0a4a53df89fd173874b46093b6e62d8c72967bbf606d672c9e8c9b601a4fc",image_registry="index.docker.io",image_repository="rancher/mirrored-coredns-coredns",image_tag="1.10.1",name="replicaset-coredns-576cfbb478-coredns",namespace="kube-system",resource_kind="ReplicaSet",resource_name="coredns-576cfbb478",severity="Critical"} 0
trivy_image_vulnerabilities{container_name="coredns",image_digest="sha256:ead0a4a53df89fd173874b46093b6e62d8c72967bbf606d672c9e8c9b601a4fc",image_registry="index.docker.io",image_repository="rancher/mirrored-coredns-coredns",image_tag="1.10.1",name="replicaset-coredns-576cfbb478-coredns",namespace="kube-system",resource_kind="ReplicaSet",resource_name="coredns-576cfbb478",severity="High"} 3
trivy_image_vulnerabilities{container_name="coredns",image_digest="sha256:ead0a4a53df89fd173874b46093b6e62d8c72967bbf606d672c9e8c9b601a4fc",image_registry="index.docker.io",image_repository="rancher/mirrored-coredns-coredns",image_tag="1.10.1",name="replicaset-coredns-576cfbb478-coredns",namespace="kube-system",resource_kind="ReplicaSet",resource_name="coredns-576cfbb478",severity="Low"} 0
trivy_image_vulnerabilities{container_name="coredns",image_digest="sha256:ead0a4a53df89fd173874b46093b6e62d8c72967bbf606d672c9e8c9b601a4fc",image_registry="index.docker.io",image_repository="rancher/mirrored-coredns-coredns",image_tag="1.10.1",name="replicaset-coredns-576cfbb478-coredns",namespace="kube-system",resource_kind="ReplicaSet",resource_name="coredns-576cfbb478",severity="Medium"} 4
trivy_image_vulnerabilities{container_name="coredns",image_digest="sha256:ead0a4a53df89fd173874b46093b6e62d8c72967bbf606d672c9e8c9b601a4fc",image_registry="index.docker.io",image_repository="rancher/mirrored-coredns-coredns",image_tag="1.10.1",name="replicaset-coredns-576cfbb478-coredns",namespace="kube-system",resource_kind="ReplicaSet",resource_name="coredns-576cfbb478",severity="Unknown"} 0
Can you please share the full metrics for trivy-operator?
BTW: the above was produced with trivy-operator v0.18.1.
@chen-keinan yup, I am sure:
> curl -s localhost:8080/metrics | grep 'aquasecurity/trivy-operator'
trivy_image_vulnerabilities{container_name="trivy-operator",image_digest="",image_registry="ghcr.io",image_repository="aquasecurity/trivy-operator",image_tag="0.18.1",k8s_label_app="",k8s_label_synerise_com_owner_team="",name="replicaset-trivy-operator-564c8d89bd-trivy-operator",namespace="trivy-operator",resource_kind="ReplicaSet",resource_name="trivy-operator-564c8d89bd",severity="Critical"} 0
trivy_image_vulnerabilities{container_name="trivy-operator",image_digest="",image_registry="ghcr.io",image_repository="aquasecurity/trivy-operator",image_tag="0.18.1",k8s_label_app="",k8s_label_synerise_com_owner_team="",name="replicaset-trivy-operator-564c8d89bd-trivy-operator",namespace="trivy-operator",resource_kind="ReplicaSet",resource_name="trivy-operator-564c8d89bd",severity="High"} 0
trivy_image_vulnerabilities{container_name="trivy-operator",image_digest="",image_registry="ghcr.io",image_repository="aquasecurity/trivy-operator",image_tag="0.18.1",k8s_label_app="",k8s_label_synerise_com_owner_team="",name="replicaset-trivy-operator-564c8d89bd-trivy-operator",namespace="trivy-operator",resource_kind="ReplicaSet",resource_name="trivy-operator-564c8d89bd",severity="Low"} 0
trivy_image_vulnerabilities{container_name="trivy-operator",image_digest="",image_registry="ghcr.io",image_repository="aquasecurity/trivy-operator",image_tag="0.18.1",k8s_label_app="",k8s_label_synerise_com_owner_team="",name="replicaset-trivy-operator-564c8d89bd-trivy-operator",namespace="trivy-operator",resource_kind="ReplicaSet",resource_name="trivy-operator-564c8d89bd",severity="Medium"} 0
trivy_image_vulnerabilities{container_name="trivy-operator",image_digest="",image_registry="ghcr.io",image_repository="aquasecurity/trivy-operator",image_tag="0.18.1",k8s_label_app="",k8s_label_synerise_com_owner_team="",name="replicaset-trivy-operator-564c8d89bd-trivy-operator",namespace="trivy-operator",resource_kind="ReplicaSet",resource_name="trivy-operator-564c8d89bd",severity="Unknown"} 0
trivy_image_vulnerabilities{container_name="trivy-operator",image_digest="sha256:19633ccb72c369e90d22e38eddd86fbc8f43851cee68c9d7d6acadd5cc053775",image_registry="ghcr.io",image_repository="aquasecurity/trivy-operator",image_tag="0.18.1",k8s_label_app="",k8s_label_synerise_com_owner_team="",name="replicaset-trivy-operator-675cf74d45-trivy-operator",namespace="trivy-operator",resource_kind="ReplicaSet",resource_name="trivy-operator-675cf74d45",severity="Critical"} 0
trivy_image_vulnerabilities{container_name="trivy-operator",image_digest="sha256:19633ccb72c369e90d22e38eddd86fbc8f43851cee68c9d7d6acadd5cc053775",image_registry="ghcr.io",image_repository="aquasecurity/trivy-operator",image_tag="0.18.1",k8s_label_app="",k8s_label_synerise_com_owner_team="",name="replicaset-trivy-operator-675cf74d45-trivy-operator",namespace="trivy-operator",resource_kind="ReplicaSet",resource_name="trivy-operator-675cf74d45",severity="High"} 0
trivy_image_vulnerabilities{container_name="trivy-operator",image_digest="sha256:19633ccb72c369e90d22e38eddd86fbc8f43851cee68c9d7d6acadd5cc053775",image_registry="ghcr.io",image_repository="aquasecurity/trivy-operator",image_tag="0.18.1",k8s_label_app="",k8s_label_synerise_com_owner_team="",name="replicaset-trivy-operator-675cf74d45-trivy-operator",namespace="trivy-operator",resource_kind="ReplicaSet",resource_name="trivy-operator-675cf74d45",severity="Low"} 0
trivy_image_vulnerabilities{container_name="trivy-operator",image_digest="sha256:19633ccb72c369e90d22e38eddd86fbc8f43851cee68c9d7d6acadd5cc053775",image_registry="ghcr.io",image_repository="aquasecurity/trivy-operator",image_tag="0.18.1",k8s_label_app="",k8s_label_synerise_com_owner_team="",name="replicaset-trivy-operator-675cf74d45-trivy-operator",namespace="trivy-operator",resource_kind="ReplicaSet",resource_name="trivy-operator-675cf74d45",severity="Medium"} 2
trivy_image_vulnerabilities{container_name="trivy-operator",image_digest="sha256:19633ccb72c369e90d22e38eddd86fbc8f43851cee68c9d7d6acadd5cc053775",image_registry="ghcr.io",image_repository="aquasecurity/trivy-operator",image_tag="0.18.1",k8s_label_app="",k8s_label_synerise_com_owner_team="",name="replicaset-trivy-operator-675cf74d45-trivy-operator",namespace="trivy-operator",resource_kind="ReplicaSet",resource_name="trivy-operator-675cf74d45",severity="Unknown"} 0
where localhost is the port-forwarded trivy-operator pod
I see you are using the k8s_label_* labels, I'll debug it.
yup
trivyOperator:
  scanJobCompressLogs: true
  reportResourceLabels: app,synerise.com/owner.team
I am confused. The number of metrics without digest had started to go down, but recently it has started to go up again:
And the number of metrics with digest has also started to go up:
We didn't change any operator settings (we only bumped resources, since we noticed some throttling); TTL is set to 24h.
After checking the trivy-operator logs, I noticed that this error is printed quite often:
❯ kubectl logs trivy-operator-566589c494-hjpjs | grep 'unable to get missing layers' -c
39
{
"level": "error",
"ts": "2024-01-16T21:17:25Z",
"logger": "reconciler.scan job",
"msg": "Scan job container",
"job": "trivy-operator/scan-vulnerabilityreport-68f456865b",
"container": "init",
"status.reason": "Error",
"status.message": "2024-01-16T21:17:16.599Z\t\u001b[31mFATAL\u001b[0m\timage scan error: scan error: scan failed: failed analysis: unable to get missing layers: unable to fetch missing layers: twirp error internal: failed to do request: Post \"http://trivy-service.trivy-operator:4954/twirp/trivy.cache.v1.Cache/MissingBlobs\": dial tcp 10.244.244.70:4954: connect: connection refused\n",
"stacktrace": "github.com/aquasecurity/trivy-operator/pkg/vulnerabilityreport/controller.(*ScanJobController).processFailedScanJob\n\t/home/runner/work/trivy-operator/trivy-operator/pkg/vulnerabilityreport/controller/scanjob.go:346\ngithub.com/aquasecurity/trivy-operator/pkg/vulnerabilityreport/controller.(*ScanJobController).SetupWithManager.(*ScanJobController).reconcileJobs.func1\n\t/home/runner/work/trivy-operator/trivy-operator/pkg/vulnerabilityreport/controller/scanjob.go:81\nsigs.k8s.io/controller-runtime/pkg/reconcile.Func.Reconcile\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/reconcile/reconcile.go:111\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:119\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:316\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:266\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/home/runner/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.16.3/pkg/internal/controller/controller.go:227"
}
but this seems to be related to Trivy itself, not the operator?
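To see how much of the error volume is this cache connection problem rather than something else, the error entries can be bucketed by their status.message. This is a rough sketch only: failed_scan_reasons is a hypothetical helper, and it assumes the operator writes one JSON object per log line (the entry above is shown pretty-printed) with the level and status.message fields seen in that entry.

```python
import json
from collections import Counter


def failed_scan_reasons(log_text):
    """Count error-level log entries by a coarse reason taken from status.message."""
    counts = Counter()
    for line in log_text.splitlines():
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict) or entry.get("level") != "error":
            continue
        message = entry.get("status.message", "")
        if "connection refused" in message:
            counts["connection refused"] += 1
        elif "unable to get missing layers" in message:
            counts["missing layers (other cause)"] += 1
        else:
            counts["other"] += 1
    return counts
```

Run against the saved output of kubectl logs, a count dominated by "connection refused" would point at the Trivy server (trivy-service on port 4954 in the message above) being unreachable from the scan jobs, not at the digest handling.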
@d-mankowski-synerise I do not think the error you mention is related to the missing digest.
@d-mankowski-synerise looking again at the example of the duplicate metric you posted above: if you look at the resource name, you'll see it is different, meaning it's not the same resource. It could be that the data is coming from an old report created before the upgrade:
trivy_image_vulnerabilities{container_name="trivy-operator",image_digest="",image_registry="ghcr.io",image_repository="aquasecurity/trivy-operator",image_tag="0.18.1",k8s_label_app="",k8s_label_synerise_com_owner_team="",name="replicaset-trivy-operator-564c8d89bd-trivy-operator",namespace="trivy-operator",resource_kind="ReplicaSet",resource_name="trivy-operator-564c8d89bd",severity="Medium"} 0
compared to:
trivy_image_vulnerabilities{container_name="trivy-operator",image_digest="sha256:19633ccb72c369e90d22e38eddd86fbc8f43851cee68c9d7d6acadd5cc053775",image_registry="ghcr.io",image_repository="aquasecurity/trivy-operator",image_tag="0.18.1",k8s_label_app="",k8s_label_synerise_com_owner_team="",name="replicaset-trivy-operator-675cf74d45-trivy-operator",namespace="trivy-operator",resource_kind="ReplicaSet",resource_name="trivy-operator-675cf74d45",severity="Medium"} 2
Let me know what you think.
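One way to test this explanation across all series, not only the two samples quoted here, is to split resource names by whether their series carry a digest. A minimal sketch, assuming the /metrics output has been saved to a string; split_resources is a hypothetical helper, and the regexes rely on the label names shown in this thread.

```python
import re

DIGEST_RE = re.compile(r'image_digest="([^"]*)"')
RESOURCE_RE = re.compile(r'resource_name="([^"]*)"')


def split_resources(metrics_text):
    """Return (resources whose series lack a digest, resources whose series have one)."""
    without_digest, with_digest = set(), set()
    for line in metrics_text.splitlines():
        digest = DIGEST_RE.search(line)
        resource = RESOURCE_RE.search(line)
        if not (digest and resource):
            continue  # comment lines or other metrics
        target = with_digest if digest.group(1) else without_digest
        target.add(resource.group(1))
    return without_digest, with_digest
```

If the intersection of the two sets is empty, no single resource is exposed in both states, which would be consistent with the digest-less series coming from separate (possibly older) reports and not from a duplicated metric.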
What steps did you take and what happened:
After upgrading trivy-operator to the latest version (0.18.1, chart version: 0.20.1) from 0.16.4, the label image_digest is missing in the metric trivy_image_vulnerabilities (deployment of the new version was around midnight):
This is problematic, because we can have, for example, two images alpine:latest, and one can be a year old, while the other is a recent one. And this makes dashboards regarding CVEs in Grafana difficult to maintain, since there is no convenient way to group images by some label.
I haven't seen this change mentioned anywhere in the changelog, hence this should be considered a bug.
The problem, I think, is caused by the lack of the digest field in vulnerabilityreports. For example, a report created by operator 0.16.4:
while one created by 0.18.1:
What did you expect to happen:
trivy_image_vulnerabilities exposes the image_digest label
Anything else you would like to add:
I haven't made any changes to the config when upgrading:
Environment:
trivy-operator version: 0.18.1
kubectl version: 1.27.7