Closed sevenseacat closed 5 months ago
Converting to citext makes PostgreSQL ignore any GIN indexes on the field
This is my absolute unscientific testing approach:
artists
CREATE INDEX artists_name_idx ON artists USING GIN (name gin_trgm_ops);
with like, no index being used:
like
iex(2)> value = Ash.CiString.new("gib") iex(3)> Tunez.Music.Artist |> Ash.Query.filter(contains(name, ^value)) |> Ash.read() [debug] QUERY OK source="artists" db=84.4ms queue=0.4ms idle=992.0ms SELECT a0."id", a0."name", a0."biography", a0."previous_names", a0."inserted_at", a0."updated_at", a0."created_by_id", a0."updated_by_id" FROM "artists" AS a0 WHERE ((a0."name"::text)::citext LIKE $1) ["%gib%"] {:ok, [results]}
tunez_dev# explain analyze SELECT a0."id", a0."name", a0."biography", a0."previous_names", a0."inserted_at", a0."updated_at", a0."created_by_id", a0."updated_by_id" FROM "artists" AS a0 WHERE ((a0."name"::text)::citext LIKE '%gib%'); QUERY PLAN ═══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════ Gather (cost=1000.00..19143.33 rows=3902 width=370) (actual time=1.236..85.798 rows=876 loops=1) Workers Planned: 2 Workers Launched: 2 -> Parallel Seq Scan on artists a0 (cost=0.00..17753.12 rows=1626 width=370) (actual time=0.177..81.573 rows=292 loops=3) Filter: ((name)::citext ~~ '%gib%'::citext) Rows Removed by Filter: 259812 Planning Time: 0.091 ms Execution Time: 85.857 ms (8 rows)
with ilike, index being used:
ilike
iex(5)> value = Ash.CiString.new("gib") iex(7)> Tunez.Music.Artist |> Ash.Query.filter(contains(name, ^value)) |> Ash.read() [debug] QUERY OK source="artists" db=8.7ms queue=0.9ms idle=1748.5ms SELECT a0."id", a0."name", a0."biography", a0."previous_names", a0."inserted_at", a0."updated_at", a0."created_by_id", a0."updated_by_id" FROM "artists" AS a0 WHERE (a0."name"::text ILIKE $1) ["%gib%"] {:ok, [results]}
tunez_dev# explain analyze SELECT a0."id", a0."name", a0."biography", a0."previous_names", a0."inserted_at", a0."updated_at", a0."created_by_id", a0."updated_by_id" FROM "artists" AS a0 WHERE (a0."name"::text ILIKE '%gib%'); QUERY PLAN ═════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════ Bitmap Heap Scan on artists a0 (cost=20.60..315.92 rows=78 width=370) (actual time=0.323..4.429 rows=876 loops=1) Recheck Cond: (name ~~* '%gib%'::text) Heap Blocks: exact=846 -> Bitmap Index Scan on artists_name_idx (cost=0.00..20.59 rows=78 width=0) (actual time=0.206..0.207 rows=876 loops=1) Index Cond: (name ~~* '%gib%'::text) Planning Time: 2.013 ms Execution Time: 4.571 ms (7 rows)
For other examples, the difference is even more stark - a full search with some aggregates, joins, pagination, etc. goes from 120ms to 1.5ms.
It seems to be worth it to me, but I'm definitely not a SQL expert and there's probably edge cases I haven't thought of!
So SQLite doesn't support ilike apparently, but it does case-insensitive searching by default? I'll investigate a bit more...
Converting to citext makes PostgreSQL ignore any GIN indexes on the field
This is my absolute unscientific testing approach:
artists
table:CREATE INDEX artists_name_idx ON artists USING GIN (name gin_trgm_ops);
artists
table has 780312 rows of dummy data in it.with
like
, no index being used:with
ilike
, index being used:For other examples, the difference is even more stark - a full search with some aggregates, joins, pagination, etc. goes from 120ms to 1.5ms.
It seems to be worth it to me, but I'm definitely not a SQL expert and there's probably edge cases I haven't thought of!