Open fgregg opened 3 years ago
If I want to roll my own scoring: https://simonwillison.net/2019/Jan/7/exploring-search-relevance-algorithms-sqlite/
got a spike going here: https://github.com/dedupeio/dedupe/tree/sqlite_index_predicate
This uses fts5, which comes with bm25 as its default scorer. Unfortunately, bm25 is not a normalized score, so we can't have threshold-defined canopies.
So we'll need to use a custom scorer. fts4 exposes "matchinfo", which makes that pretty easy to do (a few examples from peewee).
It's also possible to write custom scorers for fts5, but I couldn't find any third-party examples. Here's the bm25 "auxiliary function", which could be a prototype.
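For reference, here is the textbook Okapi BM25 form, which shows why the score has no fixed scale. This is the standard formula, not copied from the fts5 source, which may differ in details (as far as I know it also negates the result, so better matches sort first):

$$
\mathrm{score}(D, Q) = \sum_{i=1}^{n} \mathrm{IDF}(q_i) \cdot \frac{f(q_i, D)\,(k_1 + 1)}{f(q_i, D) + k_1 \left(1 - b + b \cdot \frac{|D|}{\mathrm{avgdl}}\right)}
$$

Each term is bounded above by $\mathrm{IDF}(q_i)\,(k_1 + 1)$, so the maximum possible score depends on how many terms the query has and how rare they are in the corpus. A cutoff like 0.5 therefore means something different for every query.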
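A minimal sketch of what a matchinfo-based scorer could look like from Python, assuming the bundled SQLite was built with fts4. The `docs` table, the `norm_score` name, and the IDF-weighted coverage formula are all made up for illustration; this is not what the spike branch does, just one way to get a score bounded in [0, 1].

```python
import math
import sqlite3
import struct


def parse_matchinfo(blob):
    # matchinfo() returns a blob of 32-bit unsigned ints in native byte order.
    return struct.unpack(f"={len(blob) // 4}I", blob)


def norm_score(blob):
    """IDF-weighted share of query phrases found in the row, in [0, 1].

    Expects matchinfo(tbl, 'pcnx'). Hypothetical scoring rule, for illustration.
    """
    info = parse_matchinfo(blob)
    n_phrases, n_cols, n_docs = info[0], info[1], info[2]
    x = info[3:]  # 3 ints per (phrase, column) pair
    matched = total = 0.0
    for p in range(n_phrases):
        hits_here = 0
        docs_with_hit = 0
        for c in range(n_cols):
            base = 3 * (p * n_cols + c)
            hits_here += x[base]  # hits for this phrase in this row
            docs_with_hit = max(docs_with_hit, x[base + 2])  # rows with a hit
        weight = math.log(1 + n_docs / docs_with_hit) if docs_with_hit else 0.0
        total += weight
        if hits_here:
            matched += weight
    return matched / total if total else 0.0


conn = sqlite3.connect(":memory:")
conn.create_function("norm_score", 1, norm_score)
conn.execute("CREATE VIRTUAL TABLE docs USING fts4(body)")
conn.executemany(
    "INSERT INTO docs(body) VALUES (?)",
    [("acme hardware store",), ("acme supply",), ("corner bakery",)],
)
rows = conn.execute(
    "SELECT body, norm_score(matchinfo(docs, 'pcnx')) AS score "
    "FROM docs WHERE docs MATCH ? ORDER BY score DESC",
    ("acme OR hardware",),
).fetchall()
for body, score in rows:
    print(f"{score:.3f}  {body}")
```

Because a row matching every query term always scores 1.0, a fixed threshold means the same thing across queries, which is the property bm25 lacks. Whether this particular weighting is good enough for canopies is untested.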
fts5 matchinfo implementation: https://github.com/sqlite/sqlite/blob/master/ext/fts5/fts5_test_mi.c
https://sqlite.org/fts5.html