Add basic tests of all pertinent API methods for datasets with 1M total examples, in a separate file of scale unit tests. Several large rankings are worth testing: perfectly correct, perfectly incorrect, all tied, fully interleaved, and a pattern that yields a known ROC area. The pertinent methods are probably those that compute aggregates rather than individual ranking thresholds, though selected thresholds could be checked as well.
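As a starting point, the five rankings could be generated with a known expected ROC area for each. The sketch below is only an illustration under stated assumptions: `large_rankings`, `reference_auc`, and `check_all` are hypothetical names, scores and labels are assumed to be plain sequences with higher scores ranking higher, and `reference_auc` is a stand-in (Mann-Whitney statistic, ties counted as one half) for whichever aggregate method of the API is actually under test.

```python
from itertools import groupby


def large_rankings(n=1_000_000):
    """Build the five rankings as name -> (scores, labels, expected ROC area).

    Hypothetical helper; n must be a multiple of 8 so every block is whole.
    """
    if n <= 0 or n % 8:
        raise ValueError("n must be a positive multiple of 8")
    h = n // 2                      # positives == negatives == h
    ramp = list(range(n))           # strictly increasing scores, no ties
    return {
        # every positive scores above every negative
        "perfectly_correct": (ramp, [0] * h + [1] * h, 1.0),
        # every positive scores below every negative
        "perfectly_incorrect": (ramp, [1] * h + [0] * h, 0.0),
        # one shared score, so every positive/negative pair is a tie
        "all_tied": ([0.5] * n, [0] * h + [1] * h, 0.5),
        # labels alternate 0, 1, 0, 1, ... starting at the lowest score
        "interleaved": (ramp, [i % 2 for i in range(n)], (h + 1) / (2 * h)),
        # a quarter of the positives sit below all negatives, the rest above
        "known_area": (
            ramp,
            [1] * (n // 8) + [0] * h + [1] * (3 * n // 8),
            0.75,
        ),
    }


def reference_auc(scores, labels):
    """Stand-in ROC area: fraction of positive/negative pairs ranked correctly."""
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    neg_below = 0                   # negatives with a strictly lower score
    wins = 0.0
    for _, group in groupby(pairs, key=lambda p: p[0]):
        pos = neg = 0
        for _, label in group:
            if label:
                pos += 1
            else:
                neg += 1
        # each positive beats the negatives below it and half-beats its ties
        wins += pos * (neg_below + 0.5 * neg)
        neg_below += neg
    return wins / (n_pos * n_neg)


def check_all(auc_fn, n=1_000_000, tol=1e-9):
    """Assert that auc_fn reproduces the expected area on every ranking."""
    for name, (scores, labels, expected) in large_rankings(n).items():
        got = auc_fn(scores, labels)
        assert abs(got - expected) <= tol, (name, got, expected)
```

In a scale test file, `check_all` would be called with the real aggregate method in place of `reference_auc`. The interleaved case is not exactly 0.5: with the lowest score assigned to a negative, its area is (h + 1) / (2h) for h positives, which is why the expected value is computed rather than hard-coded.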