Open alpracka opened 7 years ago
Few possible things, just an amateur myself here:
The hypen behaviour is based on how PG is splitting up the words. PG uses dictionaries to define how to break up the works into tokens or lexemes: https://www.postgresql.org/docs/9.1/static/textsearch-dictionaries.html
By default pg ships with simple
, and possibly based on your locale english
. Slashes and hypens aren't tokenized, and I think in the simple dictionary they are used as word boundaries. Using a custom dictionary looks like kind of a hassle, but will let you tokenize differently.
You might want to switch to trigram, might give you better results on similarity instead of prefix matching.
ping, any idea how to solve it?
I'm using pg-search
to search through paths like /var/foo
and /var/foo-bar
and pg_search doesn't return anything
my code looks like:
class Repository < ApplicationRecord
include PgSearch::Model
belongs_to :category
pg_search_scope :search_name, against: %i[name path], using: :trigram
end
calling Repository.search_name "bci"
generates the following query
Repository Load (2.4ms) SELECT "repositories".* FROM "repositories" INNER JOIN (SELECT "repositories"."id" AS pg_search_id, (ts_rank((to_tsvector('simple', coalesce("repositories"."name"::text, '')) || to_tsvector('simple', coalesce("repositories"."path"::text, ''))), (to_tsquery('simple', ''' ' || 'bci' || ' ''')), 0)) AS rank FROM "repositories" WHERE ('bci' % (coalesce("repositories"."name"::text, '') || ' ' || coalesce("repositories"."path"::text, '')))) AS pg_search_a8038884f8ec5d3389ecea ON "repositories"."id" = pg_search_a8038884f8ec5d3389ecea.pg_search_id ORDER BY pg_search_a8038884f8ec5d3389ecea.rank DESC, "repositories"."id" ASC
(Object doesn't support #inspect)
=>
I for sure have objects to be found in my db:
irb(main):018:0> Repository.all.map { |c| c.name.match?(/^bci\//) }.count
Repository Load (3.9ms) SELECT "repositories".* FROM "repositories"
=> 48
On my db, I have theoretically the necessary extensions:
root@4fb9d81eb9c7:/# apt-get install postgresql-common
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
postgresql-common is already the newest version (246.pgdg110+1).
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.
root@4fb9d81eb9c7:/# exit
Also find this highly annoying, I didnt expect to get this far and be stopped by a hyphen... my database is mostly model numbers
Has anyone managed to solve this? I haven't tried trigram
yet but using a different type of dictionaries don't help.
Any update on this issue? I see this: https://github.com/Casecommons/pg_search/commit/07161f169ddd2508b2116e821faab2610e5dc125
- DISALLOWED_TSQUERY_CHARACTERS = /['?\-\\:]/
+DISALLOWED_TSQUERY_CHARACTERS = /['?\\:]/
So basically they removed the '-' from the DISALLOWED_TSQUERY_CHARACTERS, is there a way to set this as a setting in the project, the DISALLOWED_TSQUERY_CHARACTERS?
Hi, I've found very similar issue #19 and possibly related to #117
I have a simple setup:
And these troubles:
I've tried several options but nothing works as expected. I also tried replacing hyphen/dash with space character, then
Product.search_by_name("DS 7604").any? # => true
butProduct.search_by_name("Hikvision DS 7604NI E1/4P/A").any? # => false
Tested with pg 9.6, sorry if it's not related with the gem but I don't know how pg tsearch work yet so trying first shot here. Thanks for help.