nasa-petal / PeTaL-labeller

The PeTaL labeler labels journal articles with biomimicry functions.
https://petal-labeller.readthedocs.io/en/latest/
The Unlicense

Look into using a relevancy threshold vs. top k for labelling #55

Closed bruffridge closed 3 years ago

bruffridge commented 3 years ago

Brandon: I'm not sure what the distribution of labels looks like in our dataset, but suppose we have papers with only 2 labels, and some with 10 or more. Instead of labelling using "top k", would it be better to label using "relevancy > some threshold" to account for the variation in number of labels?

Eric: This idea has occurred to me too, because I'm not sure how else to get away from a fixed short-ranking list length. I haven't looked into the raw logits that MATCH produces for each label, but I hope to get to this soon!

So in effect we'd be trading away one hyperparameter (k) for another (the threshold).
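The two labelling strategies being compared can be sketched in a few lines. This is a generic illustration (not MATCH's actual code), assuming the model emits one relevance score per label:

```python
import numpy as np

def top_k_labels(scores, k=3):
    """Fixed-length labelling: indices of the k highest-scoring labels."""
    return np.argsort(scores)[::-1][:k]

def threshold_labels(scores, threshold=0.5):
    """Variable-length labelling: indices of all labels scoring >= threshold."""
    return np.flatnonzero(scores >= threshold)

# toy scores for one paper over 5 candidate labels
scores = np.array([0.91, 0.62, 0.40, 0.08, 0.77])
print(top_k_labels(scores, k=3))            # → [0 4 1] (always exactly k labels)
print(threshold_labels(scores, threshold=0.5))  # → [0 1 4] (however many clear 0.5)
```

Top-k always returns exactly k labels per paper, while the threshold variant lets the label count vary with the paper, which is the motivation for the trade described above.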

bruffridge commented 3 years ago

Here's MATCH's precision-recall curve on cleaned_lens_output.json:

[Figure: match_prc — MATCH's precision-recall curve on cleaned_lens_output.json]

David Smith — 06/30/2021
Is that saying P@1 could be around 80%? Or is it extrapolated?

Eric Kong — 06/30/2021
It's been hard to optimize the threshold (because I'm not sure what to optimize it over), but at threshold = 0.5, for example, we get an average of ~3 labels, P@3 is around 0.5, and R@3 is around 0.37.

Eric Kong — 06/30/2021
RE: "Is that saying P@1 could be around 80%?" At that extreme, I think the threshold is really high (0.9999), so it seldom predicts anything (but when it does, those labels are targets 80% of the time).
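Metrics like the P@3 and R@3 quoted above can be computed generically from a score matrix. The sketch below is an illustration with made-up toy data, not MATCH's evaluation code:

```python
import numpy as np

def precision_recall_at_k(scores, targets, k):
    """Micro-averaged P@k and R@k over a batch of documents.
    scores:  (n_docs, n_labels) predicted label scores
    targets: (n_docs, n_labels) binary ground-truth labels
    """
    # indices of the k highest-scoring labels for each document
    top_k = np.argsort(scores, axis=1)[:, ::-1][:, :k]
    hits = np.take_along_axis(targets, top_k, axis=1).sum()
    precision = hits / (k * scores.shape[0])  # hits out of k slots per document
    recall = hits / targets.sum()             # hits out of all true labels
    return precision, recall

# toy batch: 2 documents x 4 candidate labels
scores = np.array([[0.9, 0.2, 0.8, 0.1],
                   [0.3, 0.7, 0.6, 0.4]])
targets = np.array([[1, 0, 0, 1],
                    [0, 1, 1, 0]])
p, r = precision_recall_at_k(scores, targets, k=2)
print(p, r)  # → 0.75 0.75
```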

This is how precision, recall, and F1 score vary with the threshold (all scores were from 0 to 1, so the threshold sweeps across that range).

[Figure: match_threshold_vs_precision_recall_and_f1 — precision, recall, and F1 vs. threshold]
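A sweep like the one plotted can be produced with a short helper. This is a generic micro-averaged sketch under the assumption of one score per label in [0, 1], not the script that generated the figure:

```python
import numpy as np

def sweep_thresholds(scores, targets, thresholds):
    """Micro-averaged precision, recall, and F1 at each threshold."""
    rows = []
    for t in thresholds:
        pred = scores >= t                      # predict every label scoring >= t
        tp = np.logical_and(pred, targets == 1).sum()
        precision = tp / max(pred.sum(), 1)     # guard against zero predictions
        recall = tp / targets.sum()
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        rows.append((t, precision, recall, f1))
    return rows

# toy batch: 2 documents x 4 candidate labels
scores = np.array([[0.9, 0.2, 0.8, 0.1],
                   [0.3, 0.7, 0.6, 0.4]])
targets = np.array([[1, 0, 0, 1],
                    [0, 1, 1, 0]])
for t, p, r, f1 in sweep_thresholds(scores, targets, np.arange(0.1, 1.0, 0.2)):
    print(f"t={t:.1f}  P={p:.2f}  R={r:.2f}  F1={f1:.2f}")
```

Raising the threshold trades recall for precision, which is exactly the shape of the curves in the figure: F1 peaks somewhere in the middle, which is one principled way to pick the threshold hyperparameter.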