loomlike closed this 12 months ago
btw, why does a draft PR trigger the tests? @miguelgfierro @anargyri Can we disable that?
This might be a solution: https://github.com/orgs/community/discussions/25722. We need to try it. Issue: #2005
@loomlike are the test failures related to the values of the metrics? I vaguely recall that we had found some bugs in the Spark implementation (or they were not matching the definitions), hence our own implementations for some of the metrics.
@anargyri I think the error was coming from the test group names (as I changed some of the test names, e.g. test_python_precision -> test_python_precision_at_k). I added a new commit to fix them in the test group lists. Let's see if that works.
Regarding MAP, Spark added `meanAveragePrecisionAt` in v3.0.0. So I changed our `map_at_k` to be just `map` and added a new `map_at_k` that implements the "at_k" part for both the python and spark utils.
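For context, a hedged usage sketch of the two functions after this change; the exact signatures and column defaults are assumed to mirror the other ranking metrics in `python_evaluation`, not taken from this diff:

```python
import pandas as pd

# Assumption: `map` and `map_at_k` sit alongside the other ranking metrics,
# and the default column names are the library's usual userID/itemID/prediction.
from recommenders.evaluation import python_evaluation as pe

rating_true = pd.DataFrame(
    {"userID": [1, 1, 1, 2], "itemID": [1, 2, 3, 1], "rating": [5, 4, 3, 5]}
)
rating_pred = pd.DataFrame(
    {"userID": [1, 1, 2, 2], "itemID": [3, 4, 1, 2], "prediction": [0.9, 0.8, 0.7, 0.6]}
)

# `map` averages precision over each user's full recommendation list
# (Spark's meanAveragePrecision); `map_at_k` truncates each list to the
# top k first (Spark's meanAveragePrecisionAt).
print(pe.map(rating_true, rating_pred))
print(pe.map_at_k(rating_true, rating_pred, k=2))
```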
Regarding the definition of MAP vs. MAP@K: what we had before was just MAP, based on the Wikipedia definition as well as Spark's implementation.
The community seems to follow this definition: AP@k is typically calculated for one user from all recommendations by averaging P@1, P@2, P@3, ..., P@k, which Spark's `meanAveragePrecisionAt` also follows.
The code in this PR follows that definition too.
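To make that definition concrete, here is a small self-contained sketch of AP@k for a single user (illustrative only, not the code in this PR); the normalization by min(k, number of relevant items) is an assumption based on my reading of Spark's `meanAveragePrecisionAt`:

```python
def average_precision_at_k(recommended, relevant, k):
    """AP@k for one user: average of the precision values P@i taken at the
    ranks i <= k where a relevant item appears.

    Assumption: normalize by min(k, len(relevant)), which is how I read
    Spark's meanAveragePrecisionAt; adjust if a different denominator is used.
    """
    if not relevant:
        return 0.0
    relevant = set(relevant)
    hits = 0
    prec_sum = 0.0
    for i, item in enumerate(recommended[:k], start=1):
        if item in relevant:
            hits += 1
            prec_sum += hits / i  # P@i at a hit position
    return prec_sum / min(k, len(relevant))


# MAP@k is then the mean of AP@k over all users, e.g.
# mean(average_precision_at_k(recs[u], truth[u], k) for u in users).
print(average_precision_at_k(recommended=[1, 3, 5, 7], relevant=[1, 5, 9], k=3))
# -> (1/1 + 2/3) / min(3, 3) = 0.5555...
```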
Good, as long as the evaluation notebooks give the same metrics as before, we should be fine.
@loomlike if you get an error on DCO you can go to details and approve it. Only admins can do this. I've approved this one.
@loomlike the issue is fixed in https://github.com/recommenders-team/recommenders/pull/2017
Can you please merge the latest staging into your branch before committing?
Description
Refactor both the python and spark ranking metrics to be the same as Spark's:
- `map_at_k` to `map`, which is basically the same as Spark's `meanAveragePrecision`
- `map_at_k` to be the same as Spark's `meanAveragePrecisionAt`
Refactor evaluation tests to use the same data fixtures between python and spark tests.
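As a sketch of the shared-fixture idea (fixture names and the `spark` session fixture are hypothetical, not this PR's actual code), one pandas fixture in a common `conftest.py` can feed both suites:

```python
# Hypothetical conftest.py sketch, for illustration only.
import pandas as pd
import pytest


@pytest.fixture(scope="module")
def rating_true():
    # Single source of ground-truth data for both evaluation test suites.
    return pd.DataFrame(
        {
            "userID": [1, 1, 1, 2, 2],
            "itemID": [1, 2, 3, 1, 4],
            "rating": [5, 4, 3, 5, 4],
        }
    )


@pytest.fixture(scope="module")
def spark_rating_true(spark, rating_true):
    # The spark tests convert the same pandas fixture, so both
    # implementations are scored against identical data.
    return spark.createDataFrame(rating_true)
```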
Note: With these changes, some notebook tests are failing. I'll need to update the expected values for the `map_at_k` metrics in those notebooks, or simply use `map`, which is our previous `map_at_k`.

Related Issues
#2002
#1990
#1702
References
Checklist:
This PR is made to the staging branch and not to the main branch.