soynatan / django-easy-audit

Yet another Django audit log app, hopefully the simplest one.
GNU General Public License v3.0
712 stars 178 forks source link

Some of the information not audit #123

Open guyage opened 4 years ago

guyage commented 4 years ago

easyaudit_requestevent query_strings, user_id value is empty, easyaudit_loginevent only record the login error, no log in successfully

jheld commented 4 years ago

Can you provide more system details?

dee0 commented 4 years ago

Might be seeing the same thing. However in my case it is just the user_id that I am not seeing.

ai=# select * from easyaudit_requestevent where url like '%score%' order by datetime desc limit 30;
   id   |       url        | method | query_string | remote_ip  |           datetime            | user_id
--------+------------------+--------+--------------+------------+-------------------------------+---------
 865001 | /ai/scorecard/3/ | GET    |              | 10.255.0.2 | 2020-02-07 19:12:27.714112+00 |
 864989 | /ai/scorecard/3/ | GET    |              | 10.255.0.2 | 2020-02-07 19:12:04.364695+00 |      53

I am user 53. I was working with a colleague who has user id 18. That first record should have a user_id of 18.

This is the Django log entry from his request someinstance@somehost | aaa.bbb.ccc.ddd - - [07/Feb/2020:19:12:28 +0000] "GET /ai/scorecard/3/ HTTP/1.0" 200 310581 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_1) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.0.3 Safari/605.1.15"

So far the only thing I have determined that is different is between his request and mine is that he is on a Mac with the browser version you see above while I am on Windows 10 with version 537.36.

That last time his id was recorded was back on 2019-10-31. For some reason after that it stopped being recorded. Fwiw, how we discovered this was that I removed a page he cares about because I didn't see any non-anonymous requests for it in over two months. He complained and I started investigating. Btw, anonymous access to the scorecard pages isn't allowed.

Based on the contents of our request log, I believe the problem happened between 2019-10-31 and 2019-12-05. Between those two points we upgraded our stack. I have attached the pip input 'requirements.txt' versions for both before and after when I think the problem started as well as our current version. Of note is that it doesn't look like the problem was introduced by an upgrade of easyaudit

One other oddity, something that suggests this might not be an easyaudit problem,

requirements-2020-10-24.txt requirements-2020-11-12.txt requirements-2020-02-07.txt

dee0 commented 4 years ago

Fwiw, in the debugsqlshell I made calls to get_user_model and session.get_decoded().get('_auth_user_id'). They seemed to work fine for user 18 ( the user I mentioned above )

dee0 commented 4 years ago

Just a comment, couldn't the code use get_user_model().objects.filter( id = 18 ).exists() instead of the combo of get_user_model().objects.get(id=user_id) and user_id=getattr(user, 'id', None) ?

You have the id already so there is no need to fetch the model instance and then retrieve it from there. You just need to know if the instance exists.

jheld commented 4 years ago

There is some possibility to simplify the ID code, yes.

Some of the caution is because the library can be used where easyaudit is on a different database so user & content_type IDs must be dereferenced explictly for filters.

For the getattr, I think I just ended up writing that code fast (and keeping safety).

Not exactly sure what the snippet you're talking about is for yet, and explictly regarding the hard-coded IDs.

dee0 commented 4 years ago

I instrumented request_signals.py like so

        if session_cookie_name in cookie:
            session_id = cookie[session_cookie_name].value

            try:
                session = Session.objects.get(session_key=session_id)
            except Session.DoesNotExist:
                session = None

            if session:
                user_id = session.get_decoded().get('_auth_user_id')
                try:
                    user = get_user_model().objects.get(id=user_id)
                except:
                    user = None
                if extraLogging and not user :
                    logger.warning( f'DAN: User with id {user_id} not found.' )
            elif extraLogging :
                logger.warning( f'DAN: Session {session_id} not found.' )
        elif extraLogging :
            logger.warning( f'DAN: Session cookie {settings.SESSION_COOKIE_NAME} not found' )

where extraLogging is True if 'scorecard' is in the URL.

My colleague again accessed the scorecard page and I found that the session cookie isn't there. That is I am seeing the message DAN: Session cookie sessionid not found in the logs.

He's using Chrome, and so he brought the devtools so we could see what cookies were being sent from the browser. There is a sessionid cookie. Nothing seems odd about it. Per his browser, it has an expiration date of Feb 24th 2020, which makes sense given that he was required to login today and we have the default session timeout of 2 weeks in place.

Of note... I checked the django_session table and his session is not there. That is a ran a query to fetch the session record that has the session_key value that is showing in his browser and no records came back.

So... I think that is probably the reason easyaudit isn't logging his user id. Although at the moment I have no idea why django isn't creating a django_session record for him.

dee0 commented 4 years ago

I updated my adhoc logging to dump the environment dictionary that easy audit receives. I can see there is a sessionid in the cookie value in the env however despite this the condition

if session_cookie_name in cookie:

is not being met. Fwiw however I do see a matching record in the table django_session. So it seems something is awry with the cookie processing. I cannot post the cookie data I have here so I am going to try and debug it a bit locally to see if I can come up with something I can actually share.

dee0 commented 4 years ago

Looks like for some reason SimpleCookie is choking on a cookie value that happens to have a space in it. Something like this csrftoken=J7xbB0mM3EOI8UvHLK9L6CmQjbmhTdTYNaYAikNHWlBw6aK4bk2hPeTgoIcQdihk; sessionid=0s59ohck56habw7dqn6dynomoikt0h50; ThingReg=Joe Bloe|jbloe@example.com; mpthing=mpencodedvalue

The space between Joe and Bloe seems to be a problem. SimpleCookie extracts ( 2, 'ThingReg', ( 'Joe', 'Joe' ) ) and then treats Bloe|jbloe@example.com as a key and since it has no value ( because of the semicolon that comes next ) SimpleCookie aborts and tosses everything it has extracted so far. So it is empty.

At the moment I am not sure if this is a problem with how SimpleCookie is being used, a problem with SimpleCookie itself, or a case of bad data coming from Chrome.

dee0 commented 4 years ago

It looks like if you were to use parse_cookie from django.http.cookie then this problem goes away.

Btw I see here https://developer.mozilla.org/fr/docs/Web/HTTP/Headers/Set-Cookie that the cookie value shouldn't contain a space. However at the moment I don't know if the browser url encoded that space and something else decoded it prior to easy audit being called or if the browser is sending bad data.

Anyhoo, it seems that parse_cookie is much less finicky than SimpleCookie.

dee0 commented 4 years ago

Oh, and it looks like why my colleague is able to authenticate even though SimpleCookie is having a problem is that

jheld commented 4 years ago

Hi @dee0 I'm sorry I haven't had a chance to review or debug this. Can you comment with a smaller set of steps/description from your latest debugging?

dee0 commented 4 years ago

Sure, I can try and summarize. If SimpleCookie encounters a cookie value with a space ignores all cookies in the request. The fix is to just use parse_cookie. Fwiw, the reason authentication is working even though audit logging is having a problem identifying the user is that wsgi handler uses parse_cookie not SimpleCookie.