makandra / minidusen

Low-tech search for ActiveRecord with MySQL or PostgreSQL
MIT License
32 stars 5 forks source link

Performance issue for negated queries #19

Open denzelem opened 5 days ago

denzelem commented 5 days ago

Very long lists in PostgreSQL perform terribly in NOT IN. Two suggestions are to use an anti-join or a subquery.

Example for submitting a filter query "!0" on a customer table with approx. ~1,000,000 rows and a runtime > 30 minutes.

SELECT "customers".* FROM "customers" WHERE "customers"."id" NOT IN (
  SELECT "customers"."id"
  FROM "customers"
  WHERE "customers"."first_name" ILIKE '%0%'
     OR "customers"."last_name" ILIKE '%0%'
     OR "customers"."internal_note" ILIKE '%0%'
) LIMIT 10;

We saw this performance issue in an application where the negate operator was submitted by accident (the search field did not display, that we support this syntax). And blocked a few worker for too long time.

Wdyt?

triskweline commented 5 days ago

How would that query look like using the other approaches you're suggesting?

denzelem commented 5 days ago

I think it could look like this. But I was not able to test this for correctness yet.

SELECT "customers".* FROM "customers"
LEFT OUTER JOIN "customers" "excluded_customers" ON (
"customers"."id" = "excluded_customers"."id"
  AND (
    "excluded_customers"."first_name" ILIKE '%0%'
    OR "excluded_customers"."last_name" ILIKE '%0%'
    OR "excluded_customers"."internal_note" ILIKE '%0%'
  )
)
WHERE "excluded_customers"."id" IS NULL;