soynatan / django-easy-audit

Yet another Django audit log app, hopefully the simplest one.
GNU General Public License v3.0

Poor performance due to unindexed datetime column #239

Closed 3cp closed 1 year ago

3cp commented 2 years ago

Hi, we noticed very slow performance on showing the logs in production. This is due to the default sorting on the "datetime" column, which is not indexed.

Normally people would not notice this issue unless they have massive logs like us.

Please add the index. Thx!

jheld commented 2 years ago

When I read this, it did surprise me a bit at first.

Curious, what kinds of queries are you performing where you see the slowdown?

https://github.com/soynatan/django-easy-audit/blob/master/easyaudit/models.py#L60 shows the index is composite, object_id and then content_type_id.

So, for any specific object id (and in tandem for that kind of object, e.g. the content type), is that one of the slow queries?

I'm not saying that adding an index including the datetime column would overall be a bad thing -- that said, I do want to make sure that we're reasonably certain that by doing so we wouldn't cause a regression in performance for queries that would grab the time-indexed field when it would have been much faster to use the object_id & content_type_id index.
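For reference, a composite index like the one jheld links to can be declared in a Django model roughly as follows. This is a sketch, not the exact easy-audit source; the field names and types are assumptions based on the linked models.py:

```python
from django.db import models


class CRUDEvent(models.Model):
    # Sketch of the relevant fields only; the real model has more columns.
    object_id = models.CharField(max_length=255)
    content_type_id = models.IntegerField()
    datetime = models.DateTimeField(auto_now_add=True)

    class Meta:
        # Composite index: fast for "all events for this object",
        # but useless for a global ORDER BY datetime over the whole table.
        indexes = [models.Index(fields=["object_id", "content_type_id"])]
```

A composite index is only usable when the query filters on its leading column(s), which is why a bare `ORDER BY datetime` cannot benefit from it.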

3cp commented 2 years ago

It's not a hand-made query; it's just the default Django admin UI showing paginated results. Note that you have a default ordering on the "datetime" column, which is not indexed.

I would say it's pretty common to sort logs based on time, or to search logs based on time. It's a bit surprising to me that the column is not indexed.

jheld commented 2 years ago

Yes, that's what I'm asking about. For all changes to a particular object on a model/table, I guess I would be surprised if the Django admin pagination was particularly slow. The total count and total matching count queries, however, could potentially be slow, and there are ways to get around that which do not require changes to the schema.

Pagination for a particular content type only -- definitely. And similarly for all objects matching a given id (that is, if you have 5 tables and you are looking at the paginated list of CRUD events for objects with ID 19 -- how useful/meaningful that is is a different story).

Would you be willing to install django-debug-toolbar, load the admin page in question, and post the query timing and the generated SQL/query-plan EXPLAIN?

Again, we can add another index (ideally a composite index, but I am open to it being just the one column), but I am curious about what you are experiencing.
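For anyone following along, a minimal django-debug-toolbar setup looks roughly like this. It's a sketch of the standard installation, assuming a default project layout; the toolbar's own documentation is authoritative:

```python
# settings.py (development only)
INSTALLED_APPS = [
    # ... existing apps ...
    "debug_toolbar",
]

MIDDLEWARE = [
    # The toolbar middleware should come as early as possible.
    "debug_toolbar.middleware.DebugToolbarMiddleware",
    # ... existing middleware ...
]

# The toolbar only renders for requests from these client IPs.
INTERNAL_IPS = ["127.0.0.1"]
```

You also need to mount `debug_toolbar.urls` in your root urls.py; once loaded, the SQL panel shows per-query timing and a one-click EXPLAIN.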

3cp commented 2 years ago

Sure, I will try to load a dump of production table to my local server for the debug.

3cp commented 2 years ago

[screenshot: Django debug toolbar SQL panel]

The generated SQL looks as expected.

SQL explained
Executed SQL
SELECT "easyaudit_requestevent"."id",
       "easyaudit_requestevent"."url",
       "easyaudit_requestevent"."method",
       "easyaudit_requestevent"."query_string",
       "easyaudit_requestevent"."user_id",
       "easyaudit_requestevent"."remote_ip",
       "easyaudit_requestevent"."datetime"
  FROM "easyaudit_requestevent"
 ORDER BY "easyaudit_requestevent"."datetime" DESC, "easyaudit_requestevent"."id" DESC
 LIMIT 100
Time
3887.160062789917 ms
Database
default
QUERY PLAN
Limit  (cost=658040.85..658052.52 rows=100 width=28) (actual time=4793.420..4793.549 rows=100 loops=1)
  ->  Gather Merge  (cost=658040.85..3016629.19 rows=20215060 width=28) (actual time=4793.418..4793.539 rows=100 loops=1)
        Workers Planned: 2
        Workers Launched: 2
        ->  Sort  (cost=657040.83..682309.65 rows=10107530 width=28) (actual time=4781.120..4781.125 rows=81 loops=3)
              Sort Key: datetime DESC, id DESC
              Sort Method: top-N heapsort  Memory: 49kB
              Worker 0:  Sort Method: top-N heapsort  Memory: 68kB
              Worker 1:  Sort Method: top-N heapsort  Memory: 56kB
              ->  Parallel Seq Scan on easyaudit_requestevent  (cost=0.00..270738.30 rows=10107530 width=28) (actual time=6.613..2170.459 rows=8093324 loops=3)
Planning Time: 0.312 ms
Execution Time: 4795.226 ms
3cp commented 2 years ago

For reference, after I manually indexed the datetime column, it became normal. It is using the new index as expected.
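The manual index matching the `re_datetime_idx` name in the plan would be something along these lines (the `CONCURRENTLY` option is my addition, worth considering on a 24M-row production table):

```sql
-- Postgres: CONCURRENTLY builds the index without blocking writes,
-- but cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY re_datetime_idx
    ON easyaudit_requestevent (datetime);
```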

SQL explained
Executed SQL
SELECT "easyaudit_requestevent"."id",
       "easyaudit_requestevent"."url",
       "easyaudit_requestevent"."method",
       "easyaudit_requestevent"."query_string",
       "easyaudit_requestevent"."user_id",
       "easyaudit_requestevent"."remote_ip",
       "easyaudit_requestevent"."datetime"
  FROM "easyaudit_requestevent"
 ORDER BY "easyaudit_requestevent"."datetime" DESC, "easyaudit_requestevent"."id" DESC
 LIMIT 100
Time
0.9999275207519531 ms
Database
default
QUERY PLAN
Limit  (cost=0.48..8.28 rows=100 width=28) (actual time=0.095..0.154 rows=100 loops=1)
  ->  Incremental Sort  (cost=0.48..1892816.63 rows=24279976 width=28) (actual time=0.095..0.148 rows=100 loops=1)
        Sort Key: datetime DESC, id DESC
        Presorted Key: datetime
        Full-sort Groups: 4  Sort Method: quicksort  Average Memory: 28kB  Peak Memory: 28kB
        ->  Index Scan Backward using re_datetime_idx on easyaudit_requestevent  (cost=0.44..800217.71 rows=24279976 width=28) (actual time=0.015..0.040 rows=101 loops=1)
Planning Time: 0.246 ms
Execution Time: 0.183 ms
jheld commented 2 years ago

Thank you for working through that.

I didn't realize you were accessing the request event objects (as opposed to CRUD events). Yes, it definitely makes sense to add that column index. Potentially we would also want to make it part of a composite index, but for now that seems outside the scope.

Can you submit a PR with your migration?
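A sketch of such a migration, assuming the straightforward route of flipping `db_index=True` on the field. The migration name, dependency, and app label are placeholders; `makemigrations` would generate the real one:

```python
from django.db import migrations, models


class Migration(migrations.Migration):

    dependencies = [
        ("easyaudit", "0001_initial"),  # placeholder dependency
    ]

    operations = [
        # Equivalent to adding db_index=True on RequestEvent.datetime;
        # Django emits a CREATE INDEX for this on migrate.
        migrations.AlterField(
            model_name="requestevent",
            name="datetime",
            field=models.DateTimeField(auto_now_add=True, db_index=True),
        ),
    ]
```

Note that Django's generated migration takes a write lock while building the index; on a very large table a manual `CREATE INDEX CONCURRENTLY` may be preferable.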

3cp commented 2 years ago

Sure, will do. Add an index on datetime on all 3 models?

jheld commented 2 years ago

I'm a bit hesitant to do it on CRUDEvent just yet unless others chime in with support or experience from having tried it themselves (I don't want to negatively affect performance for queries which are already pretty performant, as can happen when there are more indexes for the planner to choose from).

Likewise, I suppose I'm not surprised that no one has mentioned that the login event is slow, given that login events tend to be significantly less write-heavy; although that table can be big, it should never be anywhere near the size of the others.

So for now, just RequestEvent.

steverecio commented 1 year ago

Just came here to create an issue based on the same problem I'm having, and I was happy to see someone else with the same issue. The Django admin is incredibly slow because the datetime column is used for ordering but is not indexed.

Is there anything holding up #240 from being merged? Ideally the CRUDEvent table would be indexed as well (that admin page throws a 504 timeout every time).

diogosilva30 commented 1 year ago

Commenting to report that the CRUDEvent page is also unusable. Our production environment has over 1 million logs, and if we try to search anything our website crashes.

jheld commented 1 year ago

I do want to be careful about changes to the schema which are solely for issues relating to usage in the Django admin. Small targeted changes (e.g. to one model, in this case) are a safer bet as a practice when possible, even if it means more PRs.

steverecio commented 1 year ago

Could we just change the default ordering to ID instead of datetime? The datetime is auto_now_add, so shouldn't ID provide the same sort order anyway?

This would speed up CRUDEvent too without requiring an extra index.
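The equivalence this relies on can be sketched in plain Python: if ids and auto_now_add timestamps are both monotonically increasing, ordering by `-id` yields the same result as ordering by `-datetime` while being able to use the primary-key index. The data here is made up for illustration, and the assumption can break with multiple app servers and clock skew, or with explicitly set datetimes:

```python
from datetime import datetime, timedelta

# Simulated audit rows: id assigned sequentially, datetime set at insert
# time (mirrors auto_now_add behaviour with a single writer).
base = datetime(2023, 1, 1)
rows = [{"id": i, "datetime": base + timedelta(seconds=i)} for i in range(1, 6)]

by_id = sorted(rows, key=lambda r: r["id"], reverse=True)
by_dt = sorted(rows, key=lambda r: (r["datetime"], r["id"]), reverse=True)

assert by_id == by_dt  # same order, but -id can use the primary-key index
```

The `(datetime, id)` tiebreak in the second sort mirrors the admin's `ORDER BY datetime DESC, id DESC` seen in the EXPLAIN output above.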