readthedocs / ext-theme

Read the Docs drop in replacement site templates
2 stars 2 forks source link

Projects: new regression on project listing response time #463

Open agjohnson opened 2 weeks ago

agjohnson commented 2 weeks ago

It looks like it's the build or version query on the project listing now. If you look at the SQL queries in the prod debug instance, you'll see a couple build queries taking the most time now.

Here's an explain:

QUERY PLAN
Sort  (cost=4797211.16..4797758.59 rows=218972 width=336) (actual time=5702.557..5702.562 rows=15 loops=1)
  Sort Key: builds_build.date DESC
  Sort Method: quicksort  Memory: 37kB
  ->  Hash Left Join  (cost=215123.68..4710426.92 rows=218972 width=336) (actual time=5512.958..5702.524 rows=15 loops=1)
        Hash Cond: (builds_build.version_id = builds_version.id)
        Filter: (((builds_version.type)::text <> 'external'::text) OR (builds_version.type IS NULL))
        ->  Index Scan using builds_build_project_id on builds_build  (cost=0.56..4455549.77 rows=228962 width=336) (actual time=2.628..4028.505 rows=15 loops=1)
              Index Cond: (project_id = ANY ('{487639,689368,74581,815321,256207,714226,521174,24458,170010,613422,451683,17662,233368,527062,489923}'::integer[]))
              Filter: (SubPlan 1)
              Rows Removed by Filter: 442104
              SubPlan 1
                ->  Limit  (cost=0.99..13.49 rows=1 width=12) (actual time=0.008..0.008 rows=1 loops=442119)
                      ->  Nested Loop Left Join  (cost=0.99..9109.11 rows=729 width=12) (actual time=0.008..0.008 rows=1 loops=442119)
                            Filter: (((u1.type)::text <> 'external'::text) OR (u1.type IS NULL))
                            Rows Removed by Filter: 0
                            ->  Index Scan Backward using builds_buil_project_fea68c_idx on builds_build u0  (cost=0.56..2702.59 rows=762 width=16) (actual time=0.006..0.006 rows=1 loops=442119)
                                  Index Cond: (project_id = builds_build.project_id)
                            ->  Index Scan using builds_version_pkey on builds_version u1  (cost=0.43..8.40 rows=1 width=9) (actual time=0.002..0.002 rows=1 loops=464569)
                                  Index Cond: (id = u0.version_id)
        ->  Hash  (cost=147380.72..147380.72 rows=3897072 width=9) (actual time=1568.329..1568.330 rows=3907920 loops=1)
              Buckets: 131072  Batches: 64  Memory Usage: 3517kB
              ->  Seq Scan on builds_version  (cost=0.00..147380.72 rows=3897072 width=9) (actual time=0.007..838.935 rows=3907920 loops=1)
Planning Time: 0.862 ms
Execution Time: 5702.621 ms
humitos commented 2 weeks ago

Taking a quick look at it, I'd say we need an index for this filter

Filter: (((builds_version.type)::text <> 'external'::text) OR (builds_version.type IS NULL))

If I understand correctly, there is when it takes the most time.

humitos commented 2 weeks ago

Do you know what's the code that executes this query? @agjohnson

agjohnson commented 2 weeks ago

I don't, at least not without looking into this more. I wouldn't be surprised if it ended up being triggered from the templates. I wasn't able to replicate this locally, so prod might be the easiest place to try to debug this more.

New Relic might be most helpful, it can probably help align the code and query.

agjohnson commented 2 weeks ago

Note: I got this from SQL tab debug toolbar.

agjohnson commented 2 weeks ago

I also just noticed a similar really slow page load (22s) on the version listing view:

https://app.readthedocs.org/projects/test-builds/?slug=timetest&privacy=&visibility=&sort=

Not sure if that is related yet.

Interesting, but I expected this to fail to load, but it was instantaneous:

https://app.readthedocs.org/projects/test-builds/builds/?version__slug=timetest

humitos commented 2 weeks ago

I also just noticed a similar really slow page load (22s) on the version listing view:

Wow! It takes 34s in my case 💣