etianen / django-watson

Full-text multi-table search application for Django. Easy to install and use, with good performance.
BSD 3-Clause "New" or "Revised" License
1.2k stars 129 forks source link

add missing index on SearchEntry.object_id, migrate the field to `CharField(191)` #302

Closed syphar closed 1 year ago

syphar commented 1 year ago

This is a follow-up PR for #301 , migrating the field to a CharField(max_length=191), following the pattern in django-reversion.

Also this adds the needed db_index.

Local tests with postgres were fine, let's see if mysql works too :)

syphar commented 1 year ago

cc @etianen

etianen commented 1 year ago

Lovely! I'll make a release

thiagoferreiraw commented 1 year ago

For the record, I'm already seeing good improvements 🎉

Prior to this, I had added an index manually in the TextField and I was getting decent results.

But after updating the index, it got even better!

My results were:

Thank you @etianen and @syphar 🎉 💯

etianen commented 1 year ago

Odd to see a difference between a manual and a migration index. I wonder if that's reproducible.

I vaguely remember something about text classes in the django-created indices for postgres, so I guess it might be that

Or maybe the limit on the column size is helping? Although it seems unlikely.

Would be curious what the query planner says.

On Mon, 16 Jan 2023 at 20:38, Thiago Ferreira @.***> wrote:

For the record, I'm already seeing good improvements 🎉

Prior to this, I had added an index manually in the TextField and I was getting decent results.

But after updating the index, it got even better!

My results were:

  • TextField, manually indexed, 50k records -> 7ms
  • Charfield, index via migrations, 50k records -> less than 1ms

Thank you @etianen https://github.com/etianen and @syphar https://github.com/syphar 🎉 💯

— Reply to this email directly, view it on GitHub https://github.com/etianen/django-watson/pull/302#issuecomment-1384526245, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABEKCFK6BLO5ENB5VUFDGTWSWWAHANCNFSM6AAAAAATVIAZUE . You are receiving this because you were mentioned.Message ID: @.***>

syphar commented 1 year ago

Or maybe the limit on the column size is helping? Although it seems unlikely.

with postgres? definitely unlikely, storage is the same for VARCHAR and TEXT in postgres.

With mysql I can imagine any odd situation / behaviour :)

thiagoferreiraw commented 1 year ago

Hmmm, that's interesting. I thought the change from TextField to Varchar was responsible for the improvement 🤔

thiagoferreiraw commented 1 year ago

Let me get the query plan to see if there is any difference

thiagoferreiraw commented 1 year ago

Yup, something else might've interfered with my previous test. Query plan and timing results are looking the same:

Sort  (cost=170.60..170.63 rows=12 width=248) (actual time=0.082..0.083 rows=1 loops=1)
  Sort Key: entity_partnershipledger.company_name
  Sort Method: quicksort  Memory: 25kB
  ->  Nested Loop  (cost=24.52..170.39 rows=12 width=248) (actual time=0.059..0.060 rows=1 loops=1)
        ->  Bitmap Heap Scan on watson_searchentry  (cost=24.09..69.08 rows=12 width=400) (actual time=0.037..0.037 rows=1 loops=1)
              Recheck Cond: (search_tsv @@ '''olsonmo'':*'::tsquery)
              Filter: (((engine_slug)::text = 'default'::text) AND (content_type_id = 12))
              Heap Blocks: exact=1
              ->  Bitmap Index Scan on watson_searchentry_search_tsv  (cost=0.00..24.09 rows=12 width=0) (actual time=0.031..0.032 rows=1 loops=1)
                    Index Cond: (search_tsv @@ '''olsonmo'':*'::tsquery)
        ->  Index Scan using entity_partnershipledger_pkey on entity_partnershipledger  (cost=0.42..8.44 rows=1 width=248) (actual time=0.018..0.018 rows=1 loops=1)
              Index Cond: (id = (watson_searchentry.object_id)::uuid)
              Filter: ((soft_deleted_at IS NULL) AND (NOT is_project) AND (NOT is_sub_entry) AND (NOT is_archived) AND (company_id = '0c698dea-06bb-4d2e-a197-53645640ba3e'::uuid))
Planning Time: 3.189 ms
Execution Time: 0.166 ms