Closed spilliton closed 12 years ago
Hi there.
Knocked up a Rails app to test this, and I get vaguely similar results, which as you say aren't what you would expect.
I'll look at this in more detail when I have more time.
For details on the relevancy calculation, have a look at the following: https://github.com/dougal/acts_as_indexed/blob/master/lib/acts_as_indexed/search_atom.rb#L12
The linked perlmonks algorithm should help.
Thanks for reporting.
Thanks for the speedy response!
Looking at the algorithm being used ( http://www.perlmonks.com/index.pl?node_id=27509 ), it states:
...basically what is implied by the above formula is that the weight given to term in respect to a document is higher if:
it occurs many times in that document
it doesn't appear that often in other documents in the collection
In the output I'm seeing, the word 'miles' occurs exactly once in each result. So in the inverted index I think they would all have the same document score for 'miles', and thus have no defined order relative to each other when looked up by the word 'miles' alone.
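To make the tie concrete, here is a rough sketch of a generic tf-idf weight (term frequency times log of inverse document frequency). This is not the exact formula in search_atom.rb; the method name `tf_idf_scores`, the tokenizer, and the sample artist names are all made up for illustration.

```ruby
# Generic tf-idf sketch; an assumption, NOT acts_as_indexed's actual formula.
# tf_idf_scores and the sample names below are hypothetical.
def tf_idf_scores(docs, term)
  term = term.downcase
  # Naive tokenizer: lowercase, split on non-word characters.
  tokenized = docs.map { |d| d.downcase.split(/\W+/).reject(&:empty?) }

  # Number of documents containing the term at least once.
  doc_freq = tokenized.count { |tokens| tokens.include?(term) }
  return {} if doc_freq.zero?

  idf = Math.log(docs.size.to_f / doc_freq)

  docs.zip(tokenized).each_with_object({}) do |(doc, tokens), scores|
    tf = tokens.count(term)          # occurrences of the term in this document
    scores[doc] = tf * idf if tf > 0 # only matching documents get a score
  end
end

names  = ["Miles", "Miles Davis", "Buddy Miles", "John Coltrane"]
scores = tf_idf_scores(names, "miles")
scores.each { |name, score| puts format("%-12s %.4f", name, score) }
```

Every matching name contains 'miles' exactly once, so each gets the same weight and the relative order is left to whatever the sort happens to do.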
I'm thinking this issue is only noticeable when indexing on only short strings. I'd imagine it works great when indexing larger bodies of text where words are more likely to have repeats in the same record.
In my current project I do a bit of string comparison using 'amatch' and its Levenshtein implementation, which compares two strings and outputs a 0..1 value of how similar the two are. Maybe after getting the initial ordered results, we can then order results with the same score by comparing them to the original search term(s).
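A rough sketch of that tie-breaking idea, with a hand-rolled Levenshtein distance so it runs without any gems. The `levenshtein`, `similarity`, and `rerank` helpers and the scores are hypothetical, and the 0..1 normalization here (one minus distance over the longer length) is an assumption that may not match what 'amatch' computes.

```ruby
# Hypothetical helpers for illustration; not part of acts_as_indexed or amatch.

# Classic dynamic-programming edit distance, keeping only two rows.
def levenshtein(a, b)
  prev = (0..b.length).to_a
  a.each_char.with_index(1) do |ca, i|
    curr = [i]
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      curr << [prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost].min
    end
    prev = curr
  end
  prev.last
end

# Map the distance onto 0..1, where 1.0 means identical (case-insensitive).
def similarity(a, b)
  longest = [a.length, b.length].max
  return 1.0 if longest.zero?
  1.0 - levenshtein(a.downcase, b.downcase).to_f / longest
end

# Sort by index score first (descending), then by similarity to the query.
def rerank(scored, query)
  scored.sort_by { |name, score| [-score, -similarity(name, query)] }
        .map(&:first)
end

# Made-up scores standing in for tied relevance values from the index.
scored = [["Buddy Miles", 0.29], ["Miles Davis", 0.29], ["Miles", 0.29]]
ranked = rerank(scored, "Miles")
puts ranked
```

With tied index scores, the exact match "Miles" sorts first. Longer names that are equally distant from the query still tie with each other, so a further tie-breaker may be needed.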
I'll fork and experiment :)
I have records where I only care to index their title. So I added:
acts_as_indexed :fields => [:name]
To the model (Artist) and called:
Artist.send(:build_index)
I then tried a test search:
Artist.find_with_index('Miles').each{|a| puts a.name }
Here are the first few lines output:
Seeing as I have an artist with the name "Miles" exactly, I would think that would be deemed the most relevant. Do I have something misconfigured maybe?