Thanks for getting started on this. The solution looks fine, but you need to write some tests to ensure that everything works as intended. You should test that the error is raised when we expect it, but it would also be useful to make sure everything goes through in cases where there are enough points. Generally, it is good to test edge cases (e.g., just enough points, one point missing, etc.).
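For concreteness, here is a minimal sketch of what such tests could look like with pytest. It assumes that `Constraints.chunks(num_chunks=..., chunk_size=...)` raises a `ValueError` when the points cannot form the requested chunks and otherwise returns one chunk index per point (with -1 for unassigned points); the test names and exact assertions are illustrative, not the ones from this PR:

```python
import numpy as np
import pytest

from metric_learn.constraints import Constraints


def test_chunks_raises_with_too_few_points():
    # Two classes of 4 points each can hold at most 2 + 2 = 4 chunks of
    # size 2, so asking for 5 chunks should fail.
    labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
    with pytest.raises(ValueError):
        Constraints(labels).chunks(num_chunks=5, chunk_size=2)


def test_chunks_with_just_enough_points():
    # Edge case: exactly enough points to form the requested chunks.
    labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])
    chunks = Constraints(labels).chunks(num_chunks=4, chunk_size=2)
    assert np.sum(chunks >= 0) == 4 * 2  # every point ends up in a chunk
```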
Regarding where these tests should be put, I think they could either go to the class `Test_RCA` in `metric_learn_test.py` (as `RCA_Supervised` is currently the only algorithm using chunks), or to a new file `test_constraints.py`. I think the second option is better for two reasons:

- it would test `chunks` directly, which will cover all other algorithms using chunks if we add any in the future
- `test_constraints.py` would be the place where we can add more tests in the future for the methods in `constraints.py` (currently I think we only test the other methods indirectly through calls to supervised variants of specific algorithms, which is not ideal)

I took some time to get familiar with tests and wrote one for the two untested cases in the chunk generation that you discussed. I tried to model what I wrote on the existing tests, and wrote something relatively general, at the risk of being verbose.
I separated the generation of labels from the tests and wrote it myself, since it needs to satisfy some constraints for the edge cases. I hesitated to group the tests in a class, as in `test_sklearn_compat.py`, since `test_constraints.py` is supposed to test the class `Constraints`.
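As an illustration of that separation, a hypothetical label-generating helper might look like the following; the name `gen_labels_for_chunks` and its signature are made up for this sketch, not taken from the PR:

```python
import numpy as np


def gen_labels_for_chunks(num_chunks, chunk_size, n_classes=10):
    # Spread the requested chunks over the classes so that, taken together,
    # the classes hold exactly `num_chunks` chunks of `chunk_size` points.
    chunks_per_class = np.full(n_classes, num_chunks // n_classes)
    chunks_per_class[:num_chunks % n_classes] += 1
    # Each class gets chunk_size points per chunk it has to hold.
    return np.repeat(np.arange(n_classes), chunks_per_class * chunk_size)
```

Dropping a single point from such labels then gives the "one point missing" edge case mentioned above, where the error should be raised.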
This looks great!
Fixes #200. See comment: https://github.com/scikit-learn-contrib/metric-learn/pull/198#issuecomment-490927459
For each label, I count how many chunks its set of instances can hold, and sum those counts. If the total is lower than the number of requested chunks, I raise an error.
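A sketch of that check as described (not the exact code or error message from the PR) could be:

```python
import numpy as np


def check_num_chunks_feasible(partial_labels, num_chunks, chunk_size):
    # Points with a negative label are treated as unlabeled and cannot be
    # used in chunks (metric-learn's convention for partial labels).
    labels = np.asarray(partial_labels)
    labels = labels[labels >= 0]
    # Each class can hold at most floor(class_size / chunk_size) chunks.
    _, class_sizes = np.unique(labels, return_counts=True)
    max_chunks = int(np.sum(class_sizes // chunk_size))
    if max_chunks < num_chunks:
        raise ValueError('Cannot form %d chunks of %d points: at most %d '
                         'chunks are possible with these labels.'
                         % (num_chunks, chunk_size, max_chunks))
```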