jjacob7734 opened this issue 8 months ago (status: Open)
@jjacob7734 is making progress on this ticket with a single test suite (Cassini search).
The best metric I've found to evaluate the quality of a search technology is the Normalized Discounted Cumulative Gain (NDCG). That metric and other common ones are succinctly described here: https://ml-compiled.readthedocs.io/en/latest/metrics.html.
I am using the implementation in the scikit-learn module: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.ndcg_score.html
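For reference, here is a minimal sketch of how that scikit-learn function can be applied to a ranked result list. The relevance labels and ranking scores below are made up purely for illustration:

```python
import numpy as np
from sklearn.metrics import ndcg_score

# Hypothetical relevance labels for the top 10 returned documents, on the
# 0-5 scale (0 = completely irrelevant, 5 = extremely highly relevant).
true_relevance = np.asarray([[5, 4, 4, 2, 0, 3, 1, 0, 2, 1]])

# Stand-in ranking scores: the engine returned the documents in this order,
# so descending rank position serves as the score for each document.
ranking_scores = np.asarray([[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]])

# NDCG over the top 10 results; 1.0 would mean a perfect ranking.
print(ndcg_score(true_relevance, ranking_scores, k=10))
```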
Not completed prior to project pause. Moving to icebox.
💡 Description
Design and develop a software tool that can assign a score to a set of search results to quantify the quality of the result set. The inputs to the tool are a set of search results and a labeling of the top 10 documents in the result set on a scale of 0 to 5, where 0 means completely irrelevant and 5 means extremely highly relevant. The expectation is that the documents of high relevance will appear at or near the top of the result set for a successful search. Only the top 10 documents need to be considered for evaluation in this iteration.
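A minimal sketch of what such a tool could look like, assuming the labels are provided as a mapping from document ID to its 0-5 relevance label (the function and parameter names here are hypothetical, not from any existing codebase):

```python
import numpy as np
from sklearn.metrics import ndcg_score

def score_result_set(ranked_doc_ids, relevance_labels, k=10):
    """Score one search result set with NDCG@k (sketch).

    ranked_doc_ids   -- document IDs in the order the engine returned them
    relevance_labels -- dict mapping doc ID -> relevance label on the 0-5 scale
    """
    top = ranked_doc_ids[:k]
    # True relevance of each returned document; unlabeled docs default to 0
    # (completely irrelevant) -- an assumption, not a spec requirement.
    y_true = np.asarray([[relevance_labels.get(doc_id, 0) for doc_id in top]])
    # Use descending rank position as the engine's score, so the engine's
    # original ordering is what ndcg_score evaluates.
    y_score = np.asarray([[len(top) - i for i in range(len(top))]])
    return ndcg_score(y_true, y_score, k=k)
```

Defaulting unlabeled documents to 0 is one possible convention; if the labeling is guaranteed to cover the full top 10, a strict lookup that raises on a missing label would surface gaps instead of silently scoring them as irrelevant.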