Hey, I've (hopefully) fixed the CI pipeline. Could you pull main, rebase your branch onto it, and push again? Then your PR should pass the build check.
Hey @kajocina - Thanks! I have done this but now I am failing one of the unit tests for hits@k.
This has surfaced an issue with using the length of relevant_items for the division: I am not sure the metric makes much sense when k < len(relevant_items) either.
Using the example from the unit tests:
relevant_items = np.array([1, 2, 3, 4, 5, 6])
recommendation = np.array([1, 7, 3, 4, 5, 2, 10, 8, 9, 6])
hits_at_k(relevant_items, recommendation, k=1)
Using the length of relevant_items for the division, the score here would be ~0.16 (i.e. 1/6), as only one of the six relevant items can ever occupy the first position in the list of recommendations. Not sure what to do about this...
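For reference, here is a minimal sketch of the metric as this PR defines it (the function name and call signature follow the example above; the body is a reconstruction, not the repository's actual code):

import numpy as np

def hits_at_k(relevant_items, recommendation, k):
    # Fraction of relevant_items that appear in the first k recommendations.
    top_k = recommendation[:k]
    hits = np.isin(top_k, relevant_items).sum()
    return hits / len(relevant_items)

With the example above, hits_at_k(relevant_items, recommendation, k=1) returns 1/6 ≈ 0.167: the top-1 slot can only ever hold one of the six relevant items.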
I'll patch the test to the right values; I think the test was only passing with the previous length :) Looks like a bug.
Thanks @kajocina - still, even if we patch the values, I am not sure what the right answer is when k < len(relevant_items).
Even when k=1 and a relevant item is first in the list of recommendations, the returned value would only be 1/len(relevant_items). I am not sure whether that is correct. Maybe we should add to the docstring that the metric is somewhat ill-defined when k < len(relevant_items).
I think this is expected behaviour for this metric; a similar case is documented here: https://docs.ampligraph.org/en/2.0.0/generated/ampligraph.evaluation.hits_at_n_score.html. In their case they have a set of four positive triples/items and their rankings:
rankings = np.array([1, 12, 6, 2])
hits_at_n_score(rankings, n=3)
0.5
They correctly get 0.5 for n (their k) = 3, because out of the four positive triples only two rank within the top 3 (the triples ranked 1 and 2, obviously).
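For comparison, the rankings-based variant can be written in a couple of lines (a sketch based on the documented behaviour linked above, not AmpliGraph's actual source):

import numpy as np

def hits_at_n_score(rankings, n):
    # Fraction of positives whose rank falls within the top n.
    return np.mean(rankings <= n)

rankings = np.array([1, 12, 6, 2])
print(hits_at_n_score(rankings, n=3))  # 0.5 -> only ranks 1 and 2 are <= 3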
The question then is generally whether k=1 is a reasonable choice for this metric. We could add info about this as you suggest :)
You can rebase on main again and repeat; hopefully the tests will pass this time. I can add the docstring fix :)
Thx, LGTM ;)
Thanks @kajocina !
This PR fixes the division in the hits@k metric to use the length of relevant_items rather than the number of recommendations. I believe this metric should capture what fraction of the relevant_items are in the first k elements of the recommendation.
Given the example shown below, the current code would return a value of ~0.42, but I believe it should be 1, as all of the items appear within the first 7 recommendations.
After the change in this PR, the correct result is returned.
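As a concrete check with the example arrays from the discussion above, using the hits_at_k sketch given earlier in the thread (a reconstruction, not the repository's exact code):

relevant_items = np.array([1, 2, 3, 4, 5, 6])
recommendation = np.array([1, 7, 3, 4, 5, 2, 10, 8, 9, 6])
# All six relevant items sit within the top 10, so the fixed metric returns
# 6/6 = 1.0; dividing by the number of recommendations would give 6/10 = 0.6.
print(hits_at_k(relevant_items, recommendation, k=10))  # 1.0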