sfu-natlang / lensingwikipedia

Lensing Wikipedia is an interface to visually browse through human history as represented in Wikipedia. This is the source code that runs the website:
http://lensingwikipedia.cs.sfu.ca

whoosh does bad stemming #164

Open anoopsarkar opened 9 years ago

anoopsarkar commented 9 years ago

A text search for "Nanjing" makes Whoosh search for the term "nanj" instead, presumably because of an overzealous stemming rule.

request at 2015-02-11 09:39:28.251044
handling constraint "4" of type "textsearch"
handling view "7" of type "countbyreferencepoint": generating view
generating field counts for fields: referencePoints
whoosh search results: <Top 315 Results for And([Term('all_freeText', u'nanj')]) runtime=0.0127010345459>
theq629 commented 9 years ago

The default stemmer is just Porter. I believe it is possible to swap that out, or we could likely do the stemming ourselves at the data preparation phase and store the result in a separate Whoosh field.
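As a rough sketch of the "swap it out" option: wrap whatever stemming function is in use so that a list of protected words (place names, say) passes through unstemmed. Everything named here is hypothetical and not taken from the Lensing Wikipedia code. `toy_stem` is only a stand-in that mimics the one Porter rule relevant to this issue, and `PLACE_NAMES` is an assumed list that would have to come from the data preparation phase. Whoosh's `StemmingAnalyzer` should accept a custom `stemfn` (and an `ignore` set of words to leave alone), but that is worth checking against the installed version.

```python
# Hypothetical sketch; none of these names exist in the project.

def toy_stem(word):
    """Stand-in for a real Porter stemmer: strips a trailing 'ing' only.

    Like the corresponding Porter rule, it only requires a vowel in the
    remaining stem, which is why 'nanjing' collapses to 'nanj'.
    """
    if word.endswith("ing") and any(c in "aeiou" for c in word[:-3]):
        return word[:-3]
    return word


def make_protected_stemmer(stemfn, protected):
    """Wrap stemfn so that words in `protected` are returned unstemmed."""
    protected = frozenset(w.lower() for w in protected)

    def stem(word):
        lowered = word.lower()
        if lowered in protected:
            return lowered
        return stemfn(lowered)

    return stem


# Assumed list of names gathered during data preparation.
PLACE_NAMES = {"Nanjing", "Beijing"}

stem = make_protected_stemmer(toy_stem, PLACE_NAMES)

print(toy_stem("nanjing"))  # nanj
print(stem("Nanjing"))      # nanjing
print(stem("building"))     # build
```

The same protected list would have to be applied at both index time and query time, otherwise the indexed term and the searched term would no longer match.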