metacpan / metacpan-web

Web interface for MetaCPAN
http://metacpan.org

Searching for "moo" should match "Moo" before "moo". #1887

Open guest20 opened 7 years ago

guest20 commented 7 years ago
[Screenshot, 2017-05-19 at 17:16:16]

I believe a search for "moo" should rank Moo (the OO thing) over PerlPowerTools/bin/moo (a number guessing game).

This might very well require tracking which terms were searched for and which dist was clicked through by the user, and then passing that as a boost term in the ES query, though there might be other ways to implement it.

Is this kind of search-term-to-dist click-through data available anywhere?

jberger commented 7 years ago

Other than having a list of special-cased results, I don't see how we can suggest any other result over an exact word match. We have talked before about using ++ or some other metric to boost results, but I can't imagine even that would generally overrule an exact word match.

guest20 commented 7 years ago

The "special casing" comes from the odds someone will click through to a dist from a particular search term. It requires recording (searched_for, clicked_on) pairs, and giving a dist a bump for each click through it gets.
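To make the recording step concrete, here is a minimal in-memory sketch. The `ClickLog` class, its method names, and the choice to lowercase the search term are all assumptions for illustration; nothing here reflects how MetaCPAN actually stores anything.

```python
from collections import Counter, defaultdict


class ClickLog:
    """Hypothetical helper: counts (searched_for, clicked_on) pairs in memory."""

    def __init__(self):
        # Maps a normalized search term to a Counter of dist -> click count.
        self._clicks = defaultdict(Counter)

    def record(self, searched_for, clicked_on):
        # Assumption: terms are compared case-insensitively, so a search
        # for "moo" and a search for "Moo" feed the same counter.
        term = searched_for.strip().lower()
        self._clicks[term][clicked_on] += 1

    def counts(self, searched_for):
        # Returns {dist: clicks} for one search term (empty if never seen).
        term = searched_for.strip().lower()
        return dict(self._clicks.get(term, {}))


log = ClickLog()
log.record("moo", "Moo")
log.record("Moo", "Moo")
log.record("moo", "PerlPowerTools")
print(log.counts("moo"))
```

Each click-through gives the dist one more bump for that term; a real version would need persistent storage and some protection against bots and self-clicks.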

From what I know of ES, you can provide a boost term in the query, which lets you tweak results based on particular values in the index or query, or even replace the score entirely with an inline script. The latter is likely the better fit, as it allows adding a clicked_for: {term: count} key to each document that can be used to modify the weight.

Combining these two things would produce a user-corrected search weighting that ranks the more frequently clicked results for a term higher.
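As a rough sketch of what the combined query could look like, the snippet below builds an Elasticsearch `function_score` body as a plain Python dict. It uses `field_value_factor` rather than an inline script, which is a simpler way to get a similar effect. The `name` field, the `clicked_for.<term>` field layout, and the `boosted_query` helper are assumptions, and I have not checked any of this against the mapping or ES version MetaCPAN runs.

```python
import json


def boosted_query(term):
    """Hypothetical helper: build a function_score query body for one search term."""
    key = term.strip().lower()
    return {
        "query": {
            "function_score": {
                # Assumption: the existing text match runs against a "name" field.
                "query": {"match": {"name": term}},
                # Read the click count stored under clicked_for.<term>;
                # documents without that key are treated as zero clicks.
                "field_value_factor": {
                    "field": "clicked_for." + key,
                    "modifier": "log1p",
                    "missing": 0,
                },
                # Add the click bonus to the text score instead of replacing it,
                # so a dist with no clicks keeps its original score.
                "boost_mode": "sum",
            }
        }
    }


print(json.dumps(boosted_query("moo"), indent=2))
```

The `log1p` modifier dampens large click counts, and `sum` means an exact word match still counts for something; how much weight clicks should carry relative to the text score would need tuning. Search terms containing dots or spaces would not map cleanly to a field name, so a real version would need a different storage layout for those.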

Where the clicks are stored and how they're merged into the index are questions I can't really answer without knowing how the current setup works. Maybe there's a natural place for it in the dist indexer?

Grinnz commented 7 years ago

Whatever the implementation, boosting results that are more clicked is a good idea IMO. It's how Google works so well.

guest20 commented 7 years ago

Some other examples are Number::Phone and Mojolicious::Plugin, each of which is both the name of a dist and a namespace in which subclasses/plugins live.