sse-secure-systems / connaisseur

An admission controller that integrates Container Image Signature Verification into a Kubernetes cluster
https://sse-secure-systems.github.io/connaisseur/
Apache License 2.0

Latency of connaisseur cosign verification is around 2s. Is this normal? #1526

Closed: qx121 closed this issue 8 months ago

qx121 commented 8 months ago

Describe the bug

From the logs, we observed that the time between starting verification of an image and its successful verification is around 2s for Cosign-validated requests. For auto-approved requests it is much faster (around 1ms).

We are concerned that 2s is large enough to affect our SLO for pod creation.

Is there a way to find out where the latency comes from (communication with the registry, signature verification, Connaisseur itself, etc.)?

Optional: To reproduce

**Optional: Versions (please complete the following information as relevant):**
- OS:
- Kubernetes Cluster:
- Notary Server:
- Container registry:
- Connaisseur: 3.0.0
- Other:

**Optional: Additional context**

Starkteetje commented 8 months ago

Hi @qx121, yes, that is normal and is usual Cosign behaviour. If you verify an image with Cosign itself (without Connaisseur), it already takes that long. You can run `cosign generate-key-pair` to generate a syntactically valid key and then time `cosign verify --key cosign.pub securesystemsengineering/testimage:co-signed`, or the same command on any other image.

$ time cosign verify --key cosign.pub securesystemsengineering/testimage:co-signed
Error: no matching signatures: searching log query: [POST /api/v1/log/entries/retrieve][400] searchLogQueryBadRequest  &{Code:400 Message:verifying signature: invalid signature when validating ASN.1 encoded signature}
main.go:69: error during command execution: no matching signatures: searching log query: [POST /api/v1/log/entries/retrieve][400] searchLogQueryBadRequest  &{Code:400 Message:verifying signature: invalid signature when validating ASN.1 encoded signature}

real    0m2,020s
user    0m0,157s
sys     0m0,277s
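The three figures printed by `time` can be compared directly: `real` is wall-clock time, while `user` and `sys` are CPU time. The short Python sketch below does that arithmetic for the values above. It is only a rough approximation: off-CPU time covers any kind of waiting, not just network I/O, and the helper name `parse_time_value` is made up for this illustration.

```python
import re


def parse_time_value(text: str) -> float:
    """Convert a `time` duration such as '0m2,020s' to seconds.

    Accepts ',' or '.' as the decimal separator, since the separator
    depends on the locale of the shell that printed the value.
    """
    match = re.fullmatch(r"(\d+)m(\d+)[.,](\d+)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time value: {text!r}")
    minutes, seconds, fraction = match.groups()
    return int(minutes) * 60 + float(f"{seconds}.{fraction}")


# Values copied from the `time` output above.
real = parse_time_value("0m2,020s")
user = parse_time_value("0m0,157s")
sys_time = parse_time_value("0m0,277s")

cpu = user + sys_time   # time the process actually spent computing
waiting = real - cpu    # time spent blocked, e.g. waiting on responses

print(f"CPU time:     {cpu:.3f}s")
print(f"Off-CPU time: {waiting:.3f}s ({waiting / real:.0%} of wall clock)")
```

For this run, roughly 0.43s is CPU time and about 1.59s (around 79% of the wall clock) is spent waiting.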

As you can see, the validation already takes around 2s. This is expected: Cosign needs to retrieve the digest for the tag, the signature for the digest, and potentially some related transparency log entries, and it probably makes a number of further requests (e.g. authenticating to the registry to get a token before the actual access requests), so there is quite a bit of networking involved. The cryptographic verification itself should be fast. If the 2s are a bottleneck for you, the main lever I see is to move Cosign (or in this case Connaisseur) closer to the registry in terms of network latency.
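To check where the time goes in a particular environment, one option is to time each step separately. The following is a minimal Python sketch of that idea, not something Cosign or Connaisseur provides: `PhaseTimer` and the three step functions are hypothetical, and the `time.sleep` calls only stand in for real registry and verification calls.

```python
import time
from contextlib import contextmanager


class PhaseTimer:
    """Accumulates wall-clock time per named phase."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock
        self.durations = {}

    @contextmanager
    def phase(self, name):
        start = self._clock()
        try:
            yield
        finally:
            # Record the elapsed time even if the phase raised an error.
            elapsed = self._clock() - start
            self.durations[name] = self.durations.get(name, 0.0) + elapsed

    def slowest(self):
        """Return the (name, seconds) pair of the most expensive phase."""
        return max(self.durations.items(), key=lambda item: item[1])


# Hypothetical stand-ins for the real steps; replace with actual calls.
def resolve_digest():
    time.sleep(0.02)


def fetch_signature():
    time.sleep(0.05)


def verify_signature():
    time.sleep(0.001)


timer = PhaseTimer()
with timer.phase("resolve digest"):
    resolve_digest()
with timer.phase("fetch signature"):
    fetch_signature()
with timer.phase("verify signature"):
    verify_signature()

for name, seconds in timer.durations.items():
    print(f"{name:<18}{seconds * 1000:8.1f} ms")
print("slowest:", timer.slowest()[0])
```

With real calls in place of the placeholders, the per-phase numbers would show whether registry round trips or the verification itself dominate.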

So on the Connaisseur side, there's nothing we can do about that. As mentioned in #1536, we're implementing caching in 3.4.0, which will hopefully alleviate some of the pain for repeated validations of the same image.
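To illustrate why caching helps with repeated validations, here is a generic time-to-live cache sketched in Python. It is not Connaisseur's implementation: the class `ValidationCache`, its parameters, and the placeholder `slow_validate` function are all assumptions made for this example.

```python
import time


class ValidationCache:
    """Remembers validation results per image reference for a limited time."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # image reference -> (expiry time, result)

    def get_or_validate(self, image, validate):
        now = self._clock()
        entry = self._entries.get(image)
        if entry is not None and entry[0] > now:
            return entry[1]       # cache hit: no registry traffic needed
        result = validate(image)  # cache miss or expired: run the slow path
        self._entries[image] = (now + self._ttl, result)
        return result


calls = []


def slow_validate(image):
    # Hypothetical placeholder for a real signature verification.
    calls.append(image)
    return {"image": image, "verified": True}


cache = ValidationCache(ttl_seconds=30)
image = "securesystemsengineering/testimage:co-signed"
cache.get_or_validate(image, slow_validate)
cache.get_or_validate(image, slow_validate)
print(len(calls))  # the slow path ran once for two requests
```

In this sketch a validation that raises an error is not cached, so failures are retried on the next request. The expiry time is a trade-off: a tag can be moved to a different digest, so a longer TTL saves more round trips but delays noticing such a change.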

qx121 commented 8 months ago

@Starkteetje thanks for the detailed explanation! Glad that the caching feature just got released with 3.4.0; it will help a lot in our case. Also, improving the registry's locality should indeed help reduce Cosign latency.