aws / aws-application-networking-k8s

A Kubernetes controller for Amazon VPC Lattice
https://www.gateway-api-controller.eks.aws.dev/
Apache License 2.0
157 stars 48 forks source link

Add Trace IDs and metadata to log lines #648

Closed shulin-sq closed 2 weeks ago

shulin-sq commented 4 weeks ago

What type of PR is this? feature (?)

Which issue does this PR fix: https://github.com/aws/aws-application-networking-k8s/issues/603

What does this PR do / Why do we need it: This PR adds the ability to track trace id and metadata to all log lines. It also changes the log interface so that in most cases, developers are forced to add a context instance to the log line.

The previous PR was here: https://github.com/aws/aws-application-networking-k8s/pull/624 but was based on an older release version. I decided to recreate the branch on top of main.

This PR also addresses some of the comments brought up in the previous PR

Additional resources:

Testing done on this change:

ran

go test ./...

and also ran a built version in my test EKS cluster.

Automation added to e2e:

no

Will this PR introduce any new dependencies?:

no

Will this break upgrades or downgrades. Has updating a running cluster been tested?:

no, will not break since it's a change to do with the logging interface

Does this PR introduce any user-facing change?:

no

How to use:

For all Reconcile I have added something like this at the beginning of the method

    ctx = gwlog.StartReconcileTrace(ctx, c.log, "iamauthpolicy", req.Name)
    defer func() {
        gwlog.EndReconcileTrace(ctx, c.log)
    }()

this should produce a log line like so:

{"level":"info","ts":"2024-06-07T06:27:53.311Z","logger":"controller.iam-auth-policy","caller":"gwlog/metadata.go:79","msg":"[ACTION_RECONCILE_START]","type":"iamauthpolicy","name":"foo-test","trace_id":"3442054d-0f06-46ac-84bb-7e0112cfa60d"}
{"level":"info","ts":"2024-06-07T06:27:53.311Z","logger":"controller.iam-auth-policy","caller":"controllers/iamauthpolicy_controller.go:92","msg":"reconcile IAM policy","req":"lattice/foo-test","targetRef":{"group":"gateway.networking.k8s.io","kind":"HTTPRoute","name":"foo-test-all
","namespace":"foo-test"},"trace_id":"3442054d-0f06-46ac-84bb-7e0112cfa60d"}
{"level":"info","ts":"2024-06-07T06:27:53.333Z","logger":"controller.iam-auth-policy","caller":"controllers/iamauthpolicy_controller.go:110","msg":"reconciled IAM policy","req":"lattice/foo-test","targetRef":{"group":"gateway.networking.k8s.io","kind":"HTTPRoute","name":"foo-test-a
ll","namespace":"foo-test"},"isDeleted":false,"trace_id":"3442054d-0f06-46ac-84bb-7e0112cfa60d"}
{"level":"info","ts":"2024-06-07T06:27:53.333Z","logger":"controller.iam-auth-policy","caller":"gwlog/metadata.go:85","msg":"[ACTION_RECONCILE_END]","type":"iamauthpolicy","name":"foo-test","trace_id":"3442054d-0f06-46ac-84bb-7e0112cfa60d"}

Something to do next, but not as part of this PR as I do not want to conflict with future changes that want to add more log lines is that for errors you can add why a reconcile failed as part of the metadata.

for example in TargetGroupPolicyController you can change this logic to

    _, err = c.ph.ValidateAndUpdateCondition(ctx, tgPolicy)
    if err != nil {
        gwlog.AddMetadata(ctx, "failed_reason", err.Error())
        return ctrl.Result{}, err
    }

and then the failed_reason will show up in the [ACTION_RECONCILE_END] message.

You can still keep using the original logger as it's exposed as InnerLogger in TracedLogger.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

erikfuller commented 4 weeks ago

you can see that trace 3442054d-0f06-46ac-84bb-7e0112cfa60d is separate from 3442054d-0f06-46ac-84bb-7e0112cfa60d

Just checking this is what you meant as these ids are the same?

shulin-sq commented 3 weeks ago

@erikfuller oops you're correct. I've updated my description.

I tried looking through my logs to see if I Can find a better example but I have very low traffic in my dev env so I don't see a lot of overlapping actions. However I can confirm that the uuid is correctly generated as unique per action

{"level":"info","ts":"2024-06-10T16:36:21.798Z","logger":"controller.gateway","caller":"gwlog/metadata.go:85","msg":"[ACTION_RECONCILE_END]","type":
"gateway","name":"snarl-10-isglf3puxg","trace_id":"b952b248-bf75-49d0-8984-311374ad9012"}
{"level":"info","ts":"2024-06-10T16:36:21.801Z","logger":"controller.gateway","caller":"gwlog/metadata.go:79","msg":"[ACTION_RECONCILE_START]","type
":"gateway","name":"shulilntest","trace_id":"287c9b37-faaf-4429-8b8c-a85dbb778de3"}
shulin-sq commented 2 weeks ago

@erikfuller I updated the PR, please let me know if there are further requested modifications