cityofaustin / techstack

Project management for the City of Austin's new digital service delivery platform, Austin.gov.

Speed up OfficialDocumentPages query #4657

Open Niicck opened 4 years ago

Niicck commented 4 years ago

We don't have indexes backing our getOfficialDocumentsCollectionDocumentsQuery; that's why it times out.

So we need to make this query more efficient, and add the appropriate (janisbasepage_ptr_id, -date) index in such a way that it gets picked up by graphql => django => sql.

Maybe we need a data loader thing. https://apirobot.me/posts/django-graphql-solving-n-1-problem-using-dataloaders
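To make the dataloader idea concrete, here's a minimal plain-Python sketch of the batching pattern from that post (names and data here are hypothetical, not our actual code): instead of firing one query per page, collect all requested keys and resolve them with a single batched lookup.

```python
# Minimal sketch of the dataloader/batching pattern (illustrative only).
class SimpleDataLoader:
    def __init__(self, batch_fn):
        self.batch_fn = batch_fn  # takes a list of keys, returns values in the same order
        self.cache = {}

    def load_many(self, keys):
        missing = [k for k in keys if k not in self.cache]
        if missing:
            # One batched fetch instead of len(missing) separate queries.
            for key, value in zip(missing, self.batch_fn(missing)):
                self.cache[key] = value
        return [self.cache[k] for k in keys]

# Stand-in for one SQL query with WHERE id IN (...).
def batch_fetch_documents(ids):
    fake_db = {1: "doc-a", 2: "doc-b", 3: "doc-c"}
    return [fake_db[i] for i in ids]

loader = SimpleDataLoader(batch_fetch_documents)
print(loader.load_many([1, 2, 3]))  # one batched fetch, not three
```

In graphene-django land the real version would wrap an ORM `filter(id__in=keys)` call, but the shape of the fix is the same: N+1 queries collapse into one.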

Niicck commented 4 years ago

We can see what sql is getting executed by running this graphql query:

  query getOfficialDocumentsCollectionDocumentsQuery($id: ID, $after: String) {
    officialDocumentCollectionDocuments(officialDocumentCollection: $id, orderBy: "-page__date", first: 25, after: $after) {
      edges {
        node {
          page {
            id
            date
            live
            slug
            title
            authoringOffice
            summary
            name
            document {
              fileSize
              filename
              url
            }
            departments {
              slug
            }
          }
        }
      }
      pageInfo {
        hasNextPage
        endCursor
      }
    }
    __debug {
      sql {
        sql
      }
    }
  }
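(For reference, the __debug field comes from graphene-django's debug middleware. A typical setup, sketched from the graphene-django docs rather than from this repo's actual schema, looks something like:)

```python
# settings.py -- enable the SQL debug middleware (assuming graphene-django)
GRAPHENE = {
    "middleware": ["graphene_django.debug.DjangoDebugMiddleware"],
}

# schema.py -- expose the debug field; the name must match what the query asks for
import graphene
from graphene_django.debug import DjangoDebug

class Query(graphene.ObjectType):
    debug = graphene.Field(DjangoDebug, name="__debug")
```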

The output gives us this query (you get an "offset" clause when you're running batches with $after):

SELECT "official_documents_page_officialdocumentcollectionofficialdcac4"."id", "official_documents_page_officialdocumentcollectionofficialdcac4"."page_id", "official_documents_page_officialdocumentcollectionofficialdcac4"."official_document_collection_id" 
FROM "official_documents_page_officialdocumentcollectionofficialdcac4" 
INNER JOIN "official_documents_page_officialdocumentpage" ON ("official_documents_page_officialdocumentcollectionofficialdcac4"."page_id" = "official_documents_page_officialdocumentpage"."janisbasepage_ptr_id") 
ORDER BY "official_documents_page_officialdocumentpage"."date" DESC  LIMIT 25 OFFSET 50
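Worth noting about that OFFSET clause: LIMIT 25 OFFSET 50 still makes the database produce the first 50 sorted rows just to throw them away, so deeper pages get progressively more expensive. A toy sketch of the cost pattern (plain Python, not what Postgres literally does):

```python
# Toy illustration: offset pagination must materialize offset + limit rows
# before it can return a page of `limit` rows.
def paginate(sorted_rows, limit, offset):
    produced = sorted_rows[: offset + limit]  # rows the engine has to produce
    return produced[offset:], len(produced)

rows = list(range(264))  # 264 rows, like the table in the plan below
page, materialized = paginate(rows, limit=25, offset=50)
print(len(page), materialized)  # 25 75
```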

Run the query explainer:

EXPLAIN ANALYZE SELECT "official_documents_page_officialdocumentcollectionofficialdcac4"."id", "official_documents_page_officialdocumentcollectionofficialdcac4"."page_id", "official_documents_page_officialdocumentcollectionofficialdcac4"."official_document_collection_id" 
FROM "official_documents_page_officialdocumentcollectionofficialdcac4" 
INNER JOIN "official_documents_page_officialdocumentpage" ON ("official_documents_page_officialdocumentcollectionofficialdcac4"."page_id" = "official_documents_page_officialdocumentpage"."janisbasepage_ptr_id") 
ORDER BY "official_documents_page_officialdocumentpage"."date" DESC  LIMIT 25 OFFSET 50

We get this:

Limit  (cost=45.84..45.85 rows=25 width=16) (actual time=0.563..0.570 rows=25 loops=1)
  ->  Sort  (cost=45.77..45.91 rows=264 width=16) (actual time=0.547..0.561 rows=150 loops=1)
        Sort Key: official_documents_page_officialdocumentpage.date DESC
        Sort Method: quicksort  Memory: 37kB
        ->  Hash Join  (cost=40.72..43.65 rows=264 width=16) (actual time=0.307..0.449 rows=264 loops=1)
              Hash Cond: (official_documents_page_officialdocumentcollectionofficialdcac4.page_id = official_documents_page_officialdocumentpage.janisbasepage_ptr_id)
              ->  Seq Scan on official_documents_page_officialdocumentcollectionofficialdcac4  (cost=0.00..2.79 rows=264 width=12) (actual time=0.013..0.048 rows=264 loops=1)
              ->  Hash  (cost=39.79..39.79 rows=264 width=8) (actual time=0.283..0.284 rows=264 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 19kB
                    ->  Seq Scan on official_documents_page_officialdocumentpage  (cost=0.00..39.79 rows=264 width=8) (actual time=0.006..0.181 rows=264 loops=1)
Planning Time: 0.329 ms
Execution Time: 0.604 ms

Now what if we add an index on "-date" to our OfficialDocumentPage model:

    class Meta:
        indexes = [
            models.Index(fields=['-date',]),
        ]

We then get this output from the query explainer:

Limit  (cost=21.04..30.58 rows=25 width=16) (actual time=0.232..0.293 rows=25 loops=1)
  ->  Nested Loop  (cost=0.06..100.78 rows=264 width=16) (actual time=0.022..0.286 rows=80 loops=1)
        ->  Index Scan using official_do_date_51dc9d_idx on official_documents_page_officialdocumentpage  (cost=0.03..83.01 rows=264 width=8) (actual time=0.010..0.101 rows=80 loops=1)
        ->  Index Scan using official_documents_page_of_page_id_4cf34edb on official_documents_page_officialdocumentcollectionofficialdcac4  (cost=0.03..0.06 rows=1 width=12) (actual time=0.002..0.002 rows=1 loops=80)
              Index Cond: (page_id = official_documents_page_officialdocumentpage.janisbasepage_ptr_id)
Planning Time: 0.219 ms
Execution Time: 0.318 ms

It helps! We cut the execution time for that initial query roughly in half (0.604 ms down to 0.318 ms). But half a millisecond of savings would not explain why this query sometimes takes more than 30 seconds to return.


I then manually hacked together a way to make a composite index on janisbasepage_ptr_id and date. (Note- I removed the "-date" index before I did this.)

from django.db import migrations, models

class Migration(migrations.Migration):

    dependencies = [
        ('official_documents_page', '0008_auto_20200811_2012'),
    ]

    operations = [
        migrations.AddIndex(
            model_name='officialdocumentpage',
            index=models.Index(fields=['janisbasepage_ptr_id', '-date'], name='official_do_id_date_51dc9d_idx'),
        ),
    ]

Here is the explainer output:

Limit  (cost=60.56..60.62 rows=25 width=16) (actual time=15.214..15.869 rows=25 loops=1)
  ->  Sort  (cost=60.37..61.03 rows=264 width=16) (actual time=14.030..14.876 rows=100 loops=1)
        Sort Key: official_documents_page_officialdocumentpage.date DESC
        Sort Method: top-N heapsort  Memory: 33kB
        ->  Hash Join  (cost=44.94..50.28 rows=264 width=16) (actual time=5.076..11.695 rows=264 loops=1)
              Hash Cond: (official_documents_page_officialdocumentcollectionofficialdcac4.page_id = official_documents_page_officialdocumentpage.janisbasepage_ptr_id)
              ->  Seq Scan on official_documents_page_officialdocumentcollectionofficialdcac4  (cost=0.00..4.64 rows=264 width=12) (actual time=0.015..2.122 rows=264 loops=1)
              ->  Hash  (cost=41.64..41.64 rows=264 width=8) (actual time=5.034..5.041 rows=264 loops=1)
                    Buckets: 1024  Batches: 1  Memory Usage: 19kB
                    ->  Seq Scan on official_documents_page_officialdocumentpage  (cost=0.00..41.64 rows=264 width=8) (actual time=0.011..2.329 rows=264 loops=1)
Planning Time: 0.214 ms
Execution Time: 16.189 ms

This makes things much worse. (Note - this is running on my local computer, so the absolute times don't match production. But a 16ms execution time is a totally different ballpark.) The plan shows plain seq scans and a hash join, so the planner isn't using the composite index I made at all. Getting an index picked up through an inner join must be a totally different thing.
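One plausible reason the composite index gets ignored: an index ordered by (janisbasepage_ptr_id, -date) stores rows sorted by janisbasepage_ptr_id first, so scanning it in order does not produce rows in global date order, and it can't serve the ORDER BY date DESC. A toy plain-Python sketch of that (made-up data, just to show the ordering property):

```python
# Rows as (janisbasepage_ptr_id, date), already in (ptr_id, -date) index order.
rows = [
    (1, "2020-03-01"),
    (1, "2020-01-15"),
    (2, "2020-04-10"),
    (2, "2020-02-20"),
]

dates_in_index_order = [d for _, d in rows]
global_date_order = sorted(dates_in_index_order, reverse=True)
print(dates_in_index_order == global_date_order)  # False: index order != date order
```

If that's what's going on, an index leading with date (like the plain "-date" one above) is the only one of the two the sort can actually use.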


In conclusion, this issue is not solved, but we've figured out one solution that doesn't work. And we bought ourselves an extra half a millisecond for each query. Which isn't nothing. Pretty close to nothing, but it's something.

chiaberry commented 3 years ago

http://ses4j.github.io/2015/11/23/optimizing-slow-django-rest-framework-performance/