jetstack / jetstack-secure

Open source components of Jetstack Secure
https://www.jetstack.io/jetstack-secure/
Apache License 2.0

fix(log): Some improvements to the logs #537

Closed: james-w closed this 1 month ago

james-w commented 1 month ago

The current logs are somewhat confusing, and missing some very useful information. This is an attempt to clean it up in a few ways:

2024/05/10 16:33:43 retrying in 25.555756126s after error: received response with status code 404. Body:
W0510 16:33:43.832278   10875 reflector.go:535] pkg/mod/k8s.io/client-go@v0.28.3/tools/cache/reflector.go:229: failed to list route.openshift.io/v1, Resource=routes: the server could not find the requested resource

The changes are split into a few logical commits if we want to break it up and review them independently, as I know these might not be the best approaches for all of these things.

maelvls commented 1 month ago

Hey, I've taken a look at your PR. I've gone ahead and reviewed it as part of the handover effort that started this morning. I'll let Olu do a final review as I don't know much about this code base.

> Log a more informative error message when giving up on uploading readings to the server. This would be the last message before the pod exits, so if the pod ends up in CrashLoopBackoff it's important to highlight this as the reason.

~I wasn't able to reproduce the agent giving up~ I was able to reproduce the agent giving up, and to confirm that the last log line states why it stopped, by using a dummy key and waiting for the 15 min backoff limit to be reached:

$ export HTTPS_PROXY=foo
$ go run . agent -c config.yaml --client-id XXXXX -k /tmp/key --venafi-cloud -p 5s 2>&1 | grep api.venafi.cloud
2024/05/14 14:26:57 Posting data to: https://api.venafi.cloud/
2024/05/14 14:26:58 retrying in 41.388139367s after error: post to server failed: failed to execute http request to VaaS. Request https://api.venafi.cloud/v1/oauth/token/serviceaccount, status code: 400, body: [{"error":"invalid_grant","error_description":"token_signature_verification_error"}
2024/05/14 14:27:39 Posting data to: https://api.venafi.cloud/
2024/05/14 14:27:40 retrying in 35.360750201s after error: post to server failed: failed to execute http request to VaaS. Request https://api.venafi.cloud/v1/oauth/token/serviceaccount, status code: 400, body: [{"error":"invalid_grant","error_description":"token_signature_verification_error"}
2024/05/14 14:28:15 Posting data to: https://api.venafi.cloud/
2024/05/14 14:28:15 retrying in 1m38.081853176s after error: post to server failed: failed to execute http request to VaaS. Request https://api.venafi.cloud/v1/oauth/token/serviceaccount, status code: 400, body: [{"error":"invalid_grant","error_description":"token_signature_verification_error"}
2024/05/14 14:29:53 Posting data to: https://api.venafi.cloud/
2024/05/14 14:29:54 retrying in 1m37.808502147s after error: post to server failed: failed to execute http request to VaaS. Request https://api.venafi.cloud/v1/oauth/token/serviceaccount, status code: 400, body: [{"error":"invalid_grant","error_description":"token_signature_verification_error"}
2024/05/14 14:31:31 Posting data to: https://api.venafi.cloud/
2024/05/14 14:31:32 retrying in 3m17.774884905s after error: post to server failed: failed to execute http request to VaaS. Request https://api.venafi.cloud/v1/oauth/token/serviceaccount, status code: 400, body: [{"error":"invalid_grant","error_description":"token_signature_verification_error"}
2024/05/14 14:34:50 Posting data to: https://api.venafi.clou
2024/05/14 14:34:50 retrying in 3m34.920269392s after error: post to server failed: failed to execute http request to VaaS. Request https://api.venafi.cloud/v1/oauth/token/serviceaccount, status code: 400, body: [{"error":"invalid_grant","error_description":"token_signature_verification_error"}
2024/05/14 14:38:25 Posting data to: https://api.venafi.cloud/
2024/05/14 14:38:25 Exiting due to fatal error uploading: post to server failed: failed to execute http request to VaaS. Request https://api.venafi.cloud/v1/oauth/token/serviceaccount, status code: 400, body: [{"error":"invalid_grant","error_description":"token_signature_verification_error"}

(I've hidden all the useless messages about missing resources)

About the REST config bug: well spotted. I wasn't able to reproduce the panic (if that's how you spotted this).

Anyways, I think all the changes you made are sensible.

james-w commented 1 month ago

> Hey, I've taken a look at your PR. I've gone ahead and reviewed it as part of the handover effort that started this morning. I'll let Olu do a final review as I don't know much about this code base.

Thanks!

> Log a more informative error message when giving up on uploading readings to the server. This would be the last message before the pod exits, so if the pod ends up in CrashLoopBackoff it's important to highlight this as the reason.

> I wasn't able to reproduce the agent giving up, or I haven't waited long enough; I guess the backoff threshold is high:

I believe the default timeout is 15m before it gives up.

> About the REST config bug: well spotted. I wasn't able to reproduce the panic (if that's how you spotted this).

It was. Running locally without a kubeconfig file should reproduce it.

> Anyways, I think all the changes you made are sensible.

Thanks.

james-w commented 1 month ago

Thanks all, I pushed an update with the suggested changes.

tfadeyi commented 1 month ago

Thank you :+1: LGTM, I'll merge the changes to master