Closed by gingerwizard 2 years ago
This isn't ideal. I think we should index stop words and just give them a score of 0 at query time, which would ensure we match phrases accurately. That said, with any stemming we only ever get rough phrase matching anyway.
How much memory overhead would indexing stop words add?
Not a huge amount of memory, since it is really only an extra few hundred terms in the dict. The postings on disk could be large though, and they would consume memory when loaded at query time.
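For reference, the idea discussed above could be sketched roughly as follows. This is only an illustration and not the project's code: `PositionalIndex`, `STOP_WORDS`, `tokenize`, `phrase_docs` and `score` are hypothetical names, the stop list is a tiny made-up one, scoring is a plain term count, and stemming is left out entirely. Stop words get positional postings like any other term, so phrases line up exactly, but they contribute 0 to the score.

```python
from collections import defaultdict

# Hypothetical stop list for illustration; a real one has a few hundred terms.
STOP_WORDS = {"the", "of", "a", "to", "be", "or", "not"}


def tokenize(text):
    # Naive whitespace tokenizer (assumption: no stemming, no punctuation handling).
    return text.lower().split()


class PositionalIndex:
    def __init__(self):
        # term -> doc_id -> list of token positions; stop words are indexed too.
        self.postings = defaultdict(lambda: defaultdict(list))

    def add(self, doc_id, text):
        for pos, term in enumerate(tokenize(text)):
            self.postings[term][doc_id].append(pos)

    def phrase_docs(self, phrase):
        # Return ids of documents containing the exact token sequence.
        terms = tokenize(phrase)
        if not terms:
            return set()
        hits = set()
        for doc_id, starts in self.postings.get(terms[0], {}).items():
            for start in starts:
                # Every term, stop word or not, must sit at its expected offset.
                if all(start + i in self.postings.get(t, {}).get(doc_id, ())
                       for i, t in enumerate(terms)):
                    hits.add(doc_id)
                    break
        return hits

    def score(self, doc_id, query):
        # Stop words are matched but score 0; other terms score by term count.
        total = 0
        for term in tokenize(query):
            if term in STOP_WORDS:
                continue
            total += len(self.postings.get(term, {}).get(doc_id, ()))
        return total


index = PositionalIndex()
index.add(1, "to be or not to be")
index.add(2, "the wizard of oz")
index.add(3, "wizard oz")

print(index.phrase_docs("wizard of oz"))
print(index.score(1, "to be or not"))
```

If stop words were dropped at index time instead, the phrase "wizard of oz" would reduce to "wizard oz" and could not be told apart from document 3. The cost is the one raised above: the postings lists for stop words are the longest in the index.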
@enzo-inc fixes phrases with stop words
Adds print statements for index loading and speeds up indexing 3x by switching to a new stemming lib