Casecommons / pg_search

pg_search builds ActiveRecord named scopes that take advantage of PostgreSQL’s full text search
http://www.casebook.net
MIT License
1.32k stars 372 forks source link

with_pg_search_highlight does not obey limit/offset #377

Open haleymt opened 6 years ago

haleymt commented 6 years ago

Sorry in advance if this is already on your radar! I looked through the open issues but didn't see anything.

I was trying to use with_pg_search_highlight recently and was confused by how slow it made my queries. I expected ts_headline to make the search slower, so I paginated my results, but it didn't seem to make a difference one way or another. Eventually I realized that ts_headline was being applied to every result, rather than just the page of them. I ended up writing the SQL myself instead of using pg_search and the performance improved when using ts_headline by almost an order of magnitude.

My pg_search queries looked something like this:

Story.search_text_for("dog").limit(20).offset(0).with_pg_search_highlight

Where search_text_for was a pg_search_scope (I wasn't using multisearch if it makes a difference).

Is there a reason that you apply the highlighting to every row? Or is there a way to paginate that I just wasn't aware of?

Thanks!

nertzy commented 6 years ago

I’d love to figure out a way to only apply the ts_headline to the results returned instead of to the whole table.

I’m guessing that the problem is that we want the ts_headline call to only exist directly in the SELECT expression, outside of the subquery, correct?

haleymt commented 6 years ago

Yep! That's the way my SQL ended up looking. I followed the basic structure of the suggestion in some of the older docs, which was:

SELECT id, ts_headline(body, q), rank
FROM (SELECT id, body, q, ts_rank_cd(ti, q) AS rank
      FROM apod, to_tsquery('stars') q
      WHERE ti @@ q
      ORDER BY rank DESC LIMIT 10) AS foo;

Although in practice it looked more like:

SELECT ts_headline(body, q) AS pg_search_highlight, table.*
FROM table
INNER JOIN (
   SELECT id AS pg_search_id, ts_rank(body, q) AS rank
   FROM table
   WHERE ti @@ q
   ORDER BY rank DESC
   LIMIT 20
   OFFSET 0
) AS pg_search

(That may not be the best way to do it though--I'm a bit rusty on my SQL.)

rustamtolipov commented 6 years ago

I think I fixed it with PR #375