open-policy-agent / opa

Open Policy Agent (OPA) is an open source, general-purpose policy engine.
https://www.openpolicyagent.org
Apache License 2.0
9.49k stars 1.32k forks source link

OPA request logs do not include query parameters #3741

Open michael-baraboo-datastax opened 3 years ago

michael-baraboo-datastax commented 3 years ago

Hello, we're using k8s liveness and readiness checks on our OPA deployments. For liveness we do a basic call to /health. For readiness we do a call to /health?bundles to check the bundles are loaded. Both API's work as expected, however in the OPA request logging we're unable to distinguish which call was which. It would be nice to know whether a 500 response came from the liveness or readiness check just by filtering on OPA request logs.

Expected Behavior

req_path in OPA request logs contains requests parameters

Actual Behavior

req_path in OPA request logs does not contain request parameters

Steps to Reproduce the Problem

$ docker run -d -it --rm -p 8181:8181 openpolicyagent/opa run --server --addr :8181
07580280e191b9ca8f6e97e5388d1edffe1ec6eef942071568662c2798f23382
$ curl localhost:8181/health
{}
$ curl localhost:8181/health?bundles
{}
$ docker logs 07580280e191b9ca8f6e97e5388d1edffe1ec6eef942071568662c2798f2338282
{"addrs":[":8181"],"diagnostic-addrs":[],"level":"info","msg":"Initializing server.","time":"2021-08-16T21:59:03Z"}
{"client_addr":"172.17.0.1:59874","level":"info","msg":"Received request.","req_id":1,"req_method":"GET","req_path":"/health","time":"2021-08-16T21:59:08Z"}
{"client_addr":"172.17.0.1:59874","level":"info","msg":"Sent response.","req_id":1,"req_method":"GET","req_path":"/health","resp_bytes":2,"resp_duration":0.349175,"resp_status":200,"time":"2021-08-16T21:59:08Z"}
{"client_addr":"172.17.0.1:59878","level":"info","msg":"Received request.","req_id":2,"req_method":"GET","req_path":"/health","time":"2021-08-16T21:59:11Z"}
{"client_addr":"172.17.0.1:59878","level":"info","msg":"Sent response.","req_id":2,"req_method":"GET","req_path":"/health","resp_bytes":2,"resp_duration":0.338907,"resp_status":200,"time":"2021-08-16T21:59:11Z"}

Additional Info

tsandall commented 3 years ago

Thanks for filing this @michael-baraboo-datastax. It would be good to know if you need the query params to be logged at INFO or whether running OPA at DEBUG to troubleshoot would be sufficient.

We currently log query parameters at DEBUG but not at INFO.

Access logging is implemented here. It would be relatively simple to update the logger to include query parameters in the INFO message. The only endpoints that can carry sensitive values in query parameters are:

GET /v1/data?input=<value>
GET /v1/query?q=<value>

In theory those could encode sensitive values. The safest option would be to only log permitted query parameters. We could review the names of query parameters across the OPA API and then skip input and q (and any others that I may have missed).

~@michael-baraboo-datastax -- alternatively, do you need to always have the query parameters logged or would it be fine to have them logged at DEBUG level only? This way you could restart OPA with --log-level=DEBUG if you needed to see them.~

michael-baraboo-datastax commented 3 years ago

Hi @tsandall, thanks for the response.

I was unaware of the parameters being visible on debug logging. It would certainly be an option for us to consider provided it's not prohibitively verbose.

Would enabling debug logging result in additional sensitive values being printed? We're currently outputting decision logs, but that's manageable via sensitive data masking.

tsandall commented 3 years ago

I wouldn't recommend running with DEBUG in production.

I was wondering though... if the /health API is failing (returning a non-2xx response code) wouldn't the caller (i.e., the readiness check or the liveness check) report the failure? Why do you need to look at the OPA logs to determine which one failed? I'd assume K8s (or whatever) reports which check is failing.

Nonetheless, updating the access logging to report the query parameters would be relatively easy. Let us know if this is important.

michael-baraboo-datastax commented 3 years ago

I wouldn't list this as important for exactly the reason you've detailed. We can rely on k8s events to demystify here. I happened to receive only the access logs from someone else who was asking about the 500's and so I didn't have access to the k8s events at the time. Ultimately, it was coming from the call to /health?bundles due to the bundle not being loaded yet, which is an expected temporary scenario for us.

Just wanted to report this in case it'd be considered by others as worthwhile to have. Feel free to leave it as low priority or close it if you think it'd be noisy or too cumbersome to avoid logging secure data at the INFO level. Thanks for all the help!

stale[bot] commented 2 years ago

This issue has been automatically marked as inactive because it has not had any activity in the last 30 days.