solo-io / gloo

The Cloud-Native API Gateway and AI Gateway
https://docs.solo.io/
Apache License 2.0
4.09k stars 446 forks source link

Access Log Filter `runtime_key` field behaves unexpectedly #9995

Open jbohanon opened 2 months ago

jbohanon commented 2 months ago

Gloo Edge Product

Open Source

Gloo Edge Version

v1.17.0

Kubernetes Version

N/A

Describe the bug

The implementation of AccessLogFilters implicitly relies on the default value unless the end user defines the runtime values at the configured key themselves. As the default field is not required, and we compare against 0 in the absence of a value, this results in the access logging filter for status code being silently inert without a default set. Our guide currently does not show a default set. fixed in #9998

Expected Behavior

Configuring should not be so obtuse. The default field should be required, or we should set the runtime key based on the user config (non-trivial, probably bad) <-- this is a bad idea and we should not do it.

At a MINIMUM, the documentation should be updated to indicate this gotcha.

Steps to reproduce the bug

  1. Install Gloo Edge 1.17.X
  2. Follow the guide for access log filtering based on 400+ status code here
  3. Note that it is ineffective.
  4. Add default: 400 as a sibling to runtimeKey: "400" and note that it now works as expected.
  5. Remove runtimeKey and note that it still works as expected.

I suspect that if we went in and manually set a runtime key access_log_status_filter: 400 in the running envoy instance we would be able to use that key via runtimeKey: "access_log_status_filter" but I have not tested this.

Additional Environment Detail

No response

Additional Context

Envoy code where comparison occurs: https://github.com/envoyproxy/envoy/blob/v1.30.4/source/common/access_log/access_log_impl.cc#L36-L41

Implementation PR of AccessLogFilters: https://github.com/solo-io/gloo/pull/7627

Internal slack: https://solo-io-corp.slack.com/archives/CEDCS8TAP/p1725636045571749

jbohanon commented 1 month ago

This has been largely mitigated by the documentation update in #9998, probably enough to resolve this issue given the minimal value:high investment ratio of implementing a more permanent fix that would definitely be a breaking change.