ash-project / ash_sql

A library containing some shared Ecto-based sql data layer functionality.
MIT License
4 stars 8 forks source link

refactor: Use `ilike` for `contains` queries with ci_strings, instead of converting to citext and using `like` #10

Closed sevenseacat closed 5 months ago

sevenseacat commented 5 months ago

Converting to citext makes PostgreSQL ignore any GIN indexes on the field

This is my absolute unscientific testing approach:

with like, no index being used:

iex(2)> value = Ash.CiString.new("gib")
iex(3)> Tunez.Music.Artist |> Ash.Query.filter(contains(name, ^value)) |> Ash.read()
[debug] QUERY OK source="artists" db=84.4ms queue=0.4ms idle=992.0ms
SELECT a0."id", a0."name", a0."biography", a0."previous_names", a0."inserted_at", 
a0."updated_at", a0."created_by_id", a0."updated_by_id" FROM "artists" AS a0 WHERE 
((a0."name"::text)::citext LIKE $1) ["%gib%"]
{:ok, [results]}
tunez_dev# explain analyze SELECT a0."id", a0."name", a0."biography", a0."previous_names", 
a0."inserted_at", a0."updated_at", a0."created_by_id", a0."updated_by_id" FROM "artists" 
AS a0 WHERE ((a0."name"::text)::citext LIKE '%gib%');
                                                          QUERY PLAN
═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
 Gather  (cost=1000.00..19143.33 rows=3902 width=370) (actual time=1.236..85.798 rows=876 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Parallel Seq Scan on artists a0  (cost=0.00..17753.12 rows=1626 width=370) (actual time=0.177..81.573 rows=292 loops=3)
         Filter: ((name)::citext ~~ '%gib%'::citext)
         Rows Removed by Filter: 259812
 Planning Time: 0.091 ms
 Execution Time: 85.857 ms
(8 rows)

with ilike, index being used:

iex(5)> value = Ash.CiString.new("gib")
iex(7)> Tunez.Music.Artist |> Ash.Query.filter(contains(name, ^value)) |> Ash.read()
[debug] QUERY OK source="artists" db=8.7ms queue=0.9ms idle=1748.5ms
SELECT a0."id", a0."name", a0."biography", a0."previous_names", a0."inserted_at", 
a0."updated_at", a0."created_by_id", a0."updated_by_id" FROM "artists" AS a0 WHERE 
(a0."name"::text ILIKE $1) ["%gib%"]
{:ok, [results]}
tunez_dev# explain analyze SELECT a0."id", a0."name", a0."biography", a0."previous_names", 
a0."inserted_at", a0."updated_at", a0."created_by_id", a0."updated_by_id" FROM "artists" 
AS a0 WHERE (a0."name"::text ILIKE '%gib%');
                                                         QUERY PLAN
═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════
 Bitmap Heap Scan on artists a0  (cost=20.60..315.92 rows=78 width=370) (actual time=0.323..4.429 rows=876 loops=1)
   Recheck Cond: (name ~~* '%gib%'::text)
   Heap Blocks: exact=846
   ->  Bitmap Index Scan on artists_name_idx  (cost=0.00..20.59 rows=78 width=0) (actual time=0.206..0.207 rows=876 loops=1)
         Index Cond: (name ~~* '%gib%'::text)
 Planning Time: 2.013 ms
 Execution Time: 4.571 ms
(7 rows)

For other examples, the difference is even more stark - a full search with some aggregates, joins, pagination, etc. goes from 120ms to 1.5ms.

It seems to be worth it to me, but I'm definitely not a SQL expert and there's probably edge cases I haven't thought of!

sevenseacat commented 5 months ago

So SQLite doesn't support ilike apparently, but it does case-insensitive searching by default? I'll investigate a bit more...