dabapps / django-log-request-id

Django middleware and log filter to attach a unique ID to every log message generated as part of a request
BSD 2-Clause "Simplified" License
371 stars 64 forks source link

Library leads to Vary: Cookie header set #64

Closed amureki closed 1 year ago

amureki commented 2 years ago

Greetings,

thank you for the lovely package! After checking current get_log_message, I am seeing that we will always try to grab a user attribute (and fall back to user.pk or user.id if no custom attribute was set).

And there is an interesting side effect coming from that fact:

This logic means that even if we serve some simple static request that is not doing anything with the user instance - django-log-request-id middleware will still lead to vary cookie be set.

So, in short, that means, that logging library is changing response headers, which is an arguable behaviour.

I'd love to get your opinion on this first and then, if valid, continue discussion with possible mitigation options and improvements we could make.

Best, Rust

codingjoe commented 2 years ago

I agree with the sentiment. It's probably not and intended and certainly not an expeced side effect to inject a header when installing a log library.

Maybe it doesn't even make sense to log a user id, if the user wasn't touched and didn't matter for the response.

In any case, there should be an option do disable this behavior all together, like changing the setting to false or none.

RealOrangeOne commented 2 years ago

It looks like there's a LOG_USER_ATTRIBUTE setting to change what attribute is logged. Perhaps if this is explicitly set to False or None, logging of the user attribute is skipped. That way the user doesn't need to be retrieved, and so there's no Vary header.

This does have the downside of disabling the user logging for all views, rather just non-authed views.

Presumably you have the need for a session, but not full user authentication? This does feel like a fairly niche case.

amureki commented 2 years ago

@RealOrangeOne thanks for the reply!

Changing LOG_USER_ATTRIBUTE behaviour might improve here, true.

I can imagine something like this:

from django.contrib.auth.models import AnonymousUser

def get_log_message(self, request, response):
    message = 'method=%s path=%s status=%s' % (request.method, request.path, response.status_code)

    user = getattr(request, 'user', None)
    # if there is no user, or it is anonymous, instead of touching session we directly return message
    if not user or type(user) == AnonymousUser:
        return message

    # `LOG_USER_ATTRIBUTE_SETTING` accepts False/None to skip setting attribute
    #  but falls back to 'pk' if value is not set
    user_attribute = getattr(settings, 'LOG_USER_ATTRIBUTE_SETTING', 'pk')
    if user_attribute:
        user_id = getattr(user, user_attribute, None) or getattr(user, 'id', None)
        message += ' user=' + str(user_id)
    return message

Two changes here compared to the original logic:

What do you think?

RealOrangeOne commented 2 years ago

From what you mentioned above, the AnonymousUser check will still touch the session, as request.user needs to be populated from the session, which involves reading from it.

As for the rest of the change, IMO I think that makes sense. I'm sure the specifics could be nailed down with a PR.

amureki commented 2 years ago

@RealOrangeOne thanks for the valid note!

I added PR attempting to fix this: https://github.com/dabapps/django-log-request-id/pull/65

Would be happy to get feedback on everything: logic, naming, styling (as I am not familiar with this codebase yet).

j4mie commented 2 years ago

Hello! Thanks fort this issue and the PR. Just to let you know I'm not ignoring this, just very busy currently! I'll get to it when I can.