abcxyz / github-metrics-aggregator

Apache License 2.0
11 stars 3 forks source link

Switch to use beam sdk logger #171

Closed lock14 closed 8 months ago

lock14 commented 8 months ago
sethvargo commented 8 months ago

Hey @lock14 - can you try using the text logger instead of the JSON logger?

logger := logging.New(os.Stderr, slog.LevelDebug, logging.FormatText, true)

That should output a similar-ish format. I'm perplexed about how it's parsing those text strings; must be a custom fluentd parser. If you're using NewFromEnv, you can also just set LOG_FORMAT=text and restart.

yolocs commented 8 months ago

Beam seems to log to a beam specific logging server. It would be quite unfortunately if we need to diverge on the logging story

lock14 commented 8 months ago

@sethvargo I've tried various configuration of instantiating the pkg/loggin logger and they all show as severity INFO in GCP regardless of the logging level. logging test code: image Dataflow GCP log window part1: image Dataflow GCP log window part2: image

lock14 commented 8 months ago

Examinging the logs a bit more, the beam/log logger appears to have a lot more fields in the jsonPayload image Whereas the pkg/logging logger only has message and stream fields in it's jsonPayload: image

sethvargo commented 8 months ago

This is super weird, but probably not worth spending more time debugging. Can you document why we're using a different logging inline in the code?

lock14 commented 8 months ago

This is super weird, but probably not worth spending more time debugging. Can you document why we're using a different logging inline in the code?

done.

lock14 commented 8 months ago

This is just a hunch, but I suspect that the beam sdk may be using this function to instantiate a specially configured logger when launched on a dataflow runner: https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/log/log.go#L62

I don't have any concrete evidence for this, but I thought I'd mention it in case someone else ever want to come back to this and dig some more.

lock14 commented 8 months ago

It seems beam uses a different entirely different logging API. I can't really tell at which stage the logs get converted and put into cloud logging.

Could you update https://github.com/abcxyz/readability/wiki/Go-Best-Practices#logging to call out dataflow would need to use beam logging instead (and probably link to this PR for the findings you got). Thanks!

Updated the docs.