meilisearch / meilisearch-rails

Meilisearch integration for Ruby on Rails
https://www.meilisearch.com
MIT License

Issue with indexing when sanitizing attributes #323

Open joshio1 opened 7 months ago

joshio1 commented 7 months ago

Description: When I use the sanitize option, search does not return the expected results. For example, I have:

class Book
  meilisearch sanitize: true do
    attribute :title, :description
  end
end

This only happens for a book whose description contains HTML in this format:

book = create(:book, description: "<div>Hello</div><div>Hi</div>")

Now, when I search like this:

[3] pry(#<RSpec::ExampleGroups::Library>)> Book.raw_search('Hi')
=> {"hits"=>[],
 "query"=>"Hi",
 "processingTimeMs"=>0,
 "limit"=>20,
 "offset"=>0,
 "estimatedTotalHits"=>0,
 "nbHits"=>0}

Expected behavior: I expected to get 1 hit for the word Hi.

Current behavior: I get 0 hits for Hi. I do get 1 hit if I search for Hello instead, and also if the book is created with this description: <div>Hello</div><div> Hi</div> (i.e. with one space before Hi).


ellnix commented 7 months ago

The sanitize_attributes option completely removes any HTML tags from attributes before sending them to Meilisearch. So while the record you get back when you search for the book still shows [#<Book id: 1, description: "<div>Hello</div><div>Hi</div>"..., on the Meilisearch server it is saved as {"description": "HelloHi"....

Therefore you are searching for Hi in HelloHi, which returns no results because Meilisearch is a prefix-based search engine and does not return results that match in the middle of a word.
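The behavior described above can be sketched in plain Ruby. This is only an illustration under stated assumptions: strip_tags is a regex stand-in for tag removal (not the gem's actual sanitizer), prefix_match? is a deliberately simplified model of prefix search (not Meilisearch's real matching), and strip_tags_with_spaces is a hypothetical workaround that does not come from this thread.

```ruby
# Rough stand-in for tag stripping. Assumption: the gem's real sanitizer
# is not reproduced here; this regex only mimics "remove every tag".
def strip_tags(html)
  html.gsub(/<[^>]*>/, "")
end

# Hypothetical workaround: replace each tag with a space so adjacent
# block elements do not fuse into one word, then collapse whitespace.
def strip_tags_with_spaces(html)
  html.gsub(/<[^>]*>/, " ").split.join(" ")
end

# Simplified model of prefix search: a query matches only if some
# whitespace-separated word starts with it (case-insensitive).
def prefix_match?(text, query)
  text.downcase.split.any? { |word| word.start_with?(query.downcase) }
end

html = "<div>Hello</div><div>Hi</div>"

fused  = strip_tags(html)              # the two words are joined together
spaced = strip_tags_with_spaces(html)  # the two words stay separate

puts fused
puts prefix_match?(fused, "Hello")   # "Hello" is a prefix of the fused word
puts prefix_match?(fused, "Hi")      # "Hi" only occurs mid-word
puts spaced
puts prefix_match?(spaced, "Hi")     # "Hi" is now its own word
```

In a model, one option might be to compute the description attribute yourself and apply this kind of spacing before the value is indexed. Whether that fits your setup is an assumption to verify against the gem's documentation.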

Here's an issue in the meilisearch server repo where someone else ran across this: https://github.com/meilisearch/meilisearch/issues/3863.

Let me know if that explanation makes sense. I will leave the issue open just in case.