apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

[Feature Request][Go SDK]: Allow setting Log Level for pipelines. #26107

Open lostluck opened 1 year ago

lostluck commented 1 year ago

What would you like to happen?

The Go SDK currently has no way to specify a minimum log level to filter on (as noted in the Dataflow documentation, at least: https://cloud.google.com/dataflow/docs/guides/logging#SettingLevels).

This can cause cost overruns, or excess logs being throttled by backends like Cloud Logging.

The SDK should be able to set the minimum log level with a flag, or with a programmatic option, such as via the harnessopts package that is already used for setting SideInputCache capacity, sampler frequency, and heap dumps:

https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/util/harnessopts/sampler.go

https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/util/harnessopts/heap_dump.go

Issue Priority

Priority: 2 (default / most feature requests should be filed as P2)

Issue Components

Bizetremi commented 4 months ago

Hello,

Any workaround we can use to suppress the harness debug logs until this feature is available? The excess logs are causing Cloud Logging to throttle worker logs.

lostluck commented 4 months ago

@Bizetremi Which debug logs are you still seeing that are noisy? What Go SDK version are you using?

Presently this work isn't scheduled or planned for completion (hence no assignees). We would welcome a contribution to fix the issue. The 2.57.0 version cut is on the 29th of May, so a merged PR before then would be in the next release.

There shouldn't be any "ongoing" debug logs once a worker starts up these days, as last year we removed the last per-bundle noisy log. So specificity would be valuable.

There isn't a workaround presently, short of disabling the Beam remote logging entirely, which would prevent any Beam logging at all from the workers associated with the job.

That would be accomplished with hooks.DisableHook(harness.DefaultRemoteLoggingHook), as described in the comment on that constant. Discoverable, I know.

Bizetremi commented 3 months ago

Hello @lostluck

Thank you for the reply and for highlighting the changes in recent versions. We resolved the issue by upgrading the Go SDK from v2.46.0 to v2.56.0.