mysociety / alaveteli

Provide a Freedom of Information request system for your jurisdiction
https://alaveteli.org

mixed case search labels fail #1483

Open tmtmtmtm opened 10 years ago

tmtmtmtm commented 10 years ago

A search for Requested_From:liverpool_city_council doesn't work; you need requested_from:liverpool_city_council instead.

We should coerce prefixed terms like that to lower case before further processing. (Or, if there's a reason why that won't work, display a warning in common cases, and/or re-run searches that return zero results.)

[Note that this is true of both sides: requested_from:Liverpool_City_Council also doesn't work]
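As a rough illustration of the lower-casing idea, here is a minimal sketch in Python (not Alaveteli's actual code, which is Ruby). It lower-cases both sides of a known prefixed term and leaves free-text words alone. The prefix set, the function names and the `search` callable are all hypothetical; only requested_from comes from this thread, and lower-casing the value may not be appropriate for every prefix.

```python
import re

# Hypothetical set of boolean prefixes. Only "requested_from" appears in
# this thread; the others are placeholders for illustration.
BOOLEAN_PREFIXES = {"requested_from", "requested_by", "status"}

# A whitespace-delimited token of the form prefix:value.
_PREFIXED_TERM = re.compile(r"(?<!\S)([A-Za-z_]+):(\S+)")


def normalise_prefixed_terms(query):
    """Lower-case both sides of known prefix:value terms.

    Free-text words and unknown prefixes are returned unchanged.
    """
    def lower_if_known(match):
        prefix, value = match.group(1), match.group(2)
        if prefix.lower() in BOOLEAN_PREFIXES:
            return f"{prefix.lower()}:{value.lower()}"
        return match.group(0)

    return _PREFIXED_TERM.sub(lower_if_known, query)


def search_with_fallback(query, search):
    """Run `search`; on zero results, retry once with the normalised query.

    `search` is a hypothetical callable that takes a query string and
    returns a list of results.
    """
    results = search(query)
    if results:
        return results
    normalised = normalise_prefixed_terms(query)
    return search(normalised) if normalised != query else results
```

Normalising up front is the simpler of the two options. The fallback variant only changes behaviour for searches that would otherwise return nothing, at the cost of a second query.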

garethrees commented 9 years ago

The stemming (currently) works differently for capitalised words. Don't yet know what options we have to fiddle with this.

dracos commented 9 years ago

I did a search and, to my surprise, found myself asking a closely related question on the Xapian mailing list back in 2008: http://lists.xapian.org/pipermail/xapian-discuss/2008-February/005199.html - that was about the right-hand side of the boolean prefix, but I think the same applies to the left-hand side. (On stemming, see http://xapian.org/docs/sourcedoc/html/classXapian_1_1QueryParser.html#a389713b3969cac6cd98da5fb970f2f8e - TheyWorkForYou and Alaveteli use STEM_SOME. I think this is generally the right value: if someone provides a capitalised word, they probably don't want it stemmed in the query. But I think that's a separate issue from dealing with boolean prefix case, which is not related to stemming.)