profusion / sgqlc

Simple GraphQL Client
https://sgqlc.readthedocs.io/
ISC License
506 stars 85 forks source link

HTTP endpoint class should not include Authorization headers in __str__ method #201

Open buffyg opened 2 years ago

buffyg commented 2 years ago

Logging Authorization headers leaks credentials into logs (for Bearer tokens, see https://www.rfc-editor.org/rfc/rfc6750.html#section-5.3, first item recommendation, "Safeguard bearer tokens"). Suggest converting base_headers into a dictionary comprehension that redacts 'Authorization' at sgqlc/endpoint/http.py:104, as in:

{k: v for k, v in self.base_headers.items() if k != "Authorization"}

barbieri commented 2 years ago

sorry but that is really useful during debug of authentication issues, so I'd keep those and suggest you to redact the logs if you can't trust the logging storage.

You can do that with projects such as https://pypi.org/project/logredactor/ or you can do a filter manually: https://relaxdiego.com/2014/07/logging-in-python.html

also note that the __str__ is not the only place where you can have the header, if you get an error, _log_http_error() will also do a loop printing them on lines https://github.com/profusion/sgqlc/blob/master/sgqlc/endpoint/http.py#L214-L215

let me know if you agree, so I can close the bug as wontfix

buffyg commented 2 years ago

If you want logging that leaks, making it explicit that you are doing so, don't make it the default behaviour. Your response doesn't at all acknowledge that this is RFC-busting credential leakage. It's useful to developers and attackers alike. If you are logging credentials, document the behaviour so that developers know that it's happening, otherwise suggesting how to safely handle a problem that's not documented is malpractice.

barbieri commented 2 years ago

@buffyg ok, but following on your idea, I cannot log any header value since Authorization is far from being the only header for such purpose. See that I don't have any special handling for Authorization, many servers use X-API-Key, X-Client-Id and many other variants (without X-, with _ instead of -, so on).

Then if I try to fix all of them I'll likely miss one, this is why I recommended you to redact the logs based on your (albeit common, it's one of many possible headers that can't leak).

I think the easiest solution is to omit the headers in logging altogether, log only the keys or declare some redacted_headers = set() that we'll check before outputting, default to Authorization (but this is the solution I like less, in particular since it's case insensitive and so on).