Casecommons / pg_search

pg_search builds ActiveRecord named scopes that take advantage of PostgreSQL’s full text search
http://www.casebook.net
MIT License
1.3k stars 369 forks source link

search by multiple keywords #187

Open eugenekorpan opened 10 years ago

eugenekorpan commented 10 years ago

Hi there, I looked thru documentation but didn't find answer for my question. Here is the case: I have a model with two attributes: job_title and industry. And search functionality. User is able to specify both values. So how can I build search with multiple words? I mean if user entered job_title = 'developer' and industry = 'it' then I need to perform a search with OR operator Select * from jobs where job_title like '%developer%' or industry like '%it%'.. is it possible with pg_search?

Thanks, Eugene

brookzhang commented 10 years ago

I tried using two pg_search scope to deal with this condition but it failed.
The reason is: Each pg_search scope will add a column in select list named "pg_search_rank", a AmbiguousColumn error come out at "order by pg_search_rank".

It's ok to use full-text search with OR operator without pg_search, or you can create a new function such as "pg_search_columns" for your need.

marshall-lee commented 9 years ago

@eugenekorpan: At first i would say that there is no clean way in ActiveRecord to do OR query at all. :smile: What about searching by multiple keywords the question is: what are you trying to implement? Two search fields in form at the same time? Sounds strange.

On the other hand pg_search_scope has :any_word option that tells it to use ||('or'-like operator for tsquery type) instead of && to join tsqueries.

So if we have pg_search_scope :search, against: [:job_title, :industry], using: { tsearch: { any_word: true } } then pg_search('it developer') would split its arguments by ' ' and search "developer" OR "it" in all :against fields (job_title and industry). So if one simply needs 'or'-like text search queries, then this is it.

@brookzhang: What about searching in two pg_scopes simultaneously it may be done by something like subqueries. For example squeel can build such complex queries and has DSL for building OR sql queries.

Consider:

class Post < ActiveRecord::Base
  include PgSearch
  pg_search_scope :title_search, against: :title
  pg_search_scope :content_search, against: :content

  def self.title_or_content_search(title_query, content_query)
    title_relation = title_search(title_query).reorder!.except(:select).select(:id)
    content_relation = content_search(content_query).reorder!.except(:select).select(:id)
    where{ id.in(title_relation) | id.in(content_relation) }
  end
end

Then we have:

>> Post.create(content: "cats")
   (0.3ms)  BEGIN
  SQL (1.0ms)  INSERT INTO "posts" ("content", "created_at", "updated_at") VALUES ($1, $2, $3) RETURNING "id"  [["content", "cats"], ["created_at", "2014-09-16 22:48:06.757039"], ["updated_at", "2014-09-16 22:48:06.757039"]]
   (8.8ms)  COMMIT
┌────┬─────────┬───────┬─────────────────────────┬─────────────────────────┐
│ id │ content │ title │ created_at              │ updated_at              │
├────┼─────────┼───────┼─────────────────────────┼─────────────────────────┤
│ 8  ╎ cats    ╎       ╎ 2014-09-16 22:48:06 UTC ╎ 2014-09-16 22:48:06 UTC │
└────┴─────────┴───────┴─────────────────────────┴─────────────────────────┘
1 row in set
>> Post.create(title: "cyberpunk movie")
   (0.4ms)  BEGIN
  SQL (0.8ms)  INSERT INTO "posts" ("created_at", "title", "updated_at") VALUES ($1, $2, $3) RETURNING "id"  [["created_at", "2014-09-16 22:48:15.003326"], ["title", "cyberpunk movie"], ["updated_at", "2014-09-16 22:48:15.003326"]]
   (8.7ms)  COMMIT
┌────┬─────────┬─────────────────┬─────────────────────────┬─────────────────────────┐
│ id │ content │ title           │ created_at              │ updated_at              │
├────┼─────────┼─────────────────┼─────────────────────────┼─────────────────────────┤
│ 9  ╎         ╎ cyberpunk movie ╎ 2014-09-16 22:48:15 UTC ╎ 2014-09-16 22:48:15 UTC │
└────┴─────────┴─────────────────┴─────────────────────────┴─────────────────────────┘
1 row in set
>> Post.title_or_content_search("movie", "cats")
  Post Load (1.3ms)  SELECT "posts".* FROM "posts"  WHERE (("posts"."id" IN (SELECT "posts"."id" FROM "posts"  WHERE (((to_tsvector('simple', coalesce("posts"."title"::text, ''))) @@ (to_tsquery('simple', ''' ' || 'movie' || ' '''))))) OR "posts"."id" IN (SELECT "posts"."id" FROM "posts"  WHERE (((to_tsvector('simple', coalesce("posts"."content"::text, ''))) @@ (to_tsquery('simple', ''' ' || 'cats' || ' ''')))))))
┌────┬─────────┬─────────────────┬─────────────────────────┬─────────────────────────┐
│ id │ content │ title           │ created_at              │ updated_at              │
├────┼─────────┼─────────────────┼─────────────────────────┼─────────────────────────┤
│ 8  ╎ cats    ╎                 ╎ 2014-09-16 22:48:06 UTC ╎ 2014-09-16 22:48:06 UTC │
│ 9  ╎         ╎ cyberpunk movie ╎ 2014-09-16 22:48:15 UTC ╎ 2014-09-16 22:48:15 UTC │
└────┴─────────┴─────────────────┴─────────────────────────┴─────────────────────────┘
2 rows in set