Open mnhqut opened 1 month ago
I don't really understand it either. It calls `get_top_k_items`, which does this:

With `k` being `top_k` (`threshold`). It then calls `.head(k)`, but according to the pandas docs, `k` should be an integer.

The `DEFAULT_THRESHOLD` variable is just `10`, an integer. Have you tried using a float? What happens then? Does it work? Isn't this exactly the same as using `k`?
I think a user expects `by_threshold` to return only the items that have a rating above that threshold.
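To make the distinction concrete, here is a minimal sketch of the two behaviours being compared: taking the top `k` rows per user versus keeping only rows whose rating exceeds a threshold. This is not the library's code. The helper names `top_k_per_user` and `above_threshold`, the column names `userID`, `itemID` and `prediction`, and the sample data are all assumptions made for illustration.

```python
import pandas as pd


def top_k_per_user(df, k, col_user="userID", col_rating="prediction"):
    """Hypothetical helper: keep the k highest-rated rows for each user."""
    ranked = df.sort_values([col_user, col_rating], ascending=[True, False])
    return ranked.groupby(col_user).head(k)


def above_threshold(df, threshold, col_rating="prediction"):
    """Hypothetical helper: keep rows whose rating exceeds a rating threshold."""
    return df[df[col_rating] > threshold]


# Made-up predictions for two users and three items.
preds = pd.DataFrame(
    {
        "userID": [1, 1, 1, 2, 2, 2],
        "itemID": [10, 11, 12, 10, 11, 12],
        "prediction": [4.5, 3.0, 1.5, 2.0, 3.5, 5.0],
    }
)

# Count-based selection: always k rows per user, whatever the ratings are.
top2 = top_k_per_user(preds, k=2)

# Rating-based selection: the number of rows per user depends on the ratings.
rated = above_threshold(preds, threshold=3.2)

print(top2)
print(rated)
```

With a count-based cutoff every user gets the same number of items, while a rating-based cutoff can return a different number of items per user, which is the behaviour the name `by_threshold` suggests.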
Yes, it is indeed very confusing how `by_threshold` works.
As a quick test, I passed `relevancy_method="top_k", k=20` and got exactly the same result as passing `relevancy_method="by_threshold", threshold=20` with any value of `k`. So the threshold here does not seem to represent a (float) rating value.
As for the type of `top_k`: when I passed `threshold` or `k` as a float, it raised no error and behaved as if the number had been rounded up, for example 2.1 treated as 3 and 7.6 as 8.
Description
What is the difference between the `k` and `threshold` values here if both get assigned to `top_k`? Isn't `threshold` supposed to be a rating value that the predictions should exceed, rather than the number of items in the top-k list? Thanks.
Other Comments