Open glebuk opened 5 years ago
Some of these might merit baseline tests (in the sense of tests derived from BaseTestBaseline that utilize CheckEquality and the like), but one thing to keep in mind is that the reason we had so many baseline tests is that previously we were writing a command-line tool, so it made sense for the baselines to be comparisons of outputs.
Now that we are writing an API, merely comparing console output may not be appropriate. And in some cases the public surface of the API has changed entirely: from the point of view of the public API, to give one example, there is no such thing as an evaluator per se (in the sense of something implementing IEvaluator).
Given this shift in focus toward a .NET API, it's unclear to me that our testing story should remain the same as it was when this code was 99.9% an internal command-line tool.
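To make the distinction concrete, the baseline style discussed above can be sketched generically. This is a hypothetical illustration in Python, not the actual BaseTestBaseline/CheckEquality implementation: a run's captured output is diffed line-by-line against a checked-in baseline file.

```python
from pathlib import Path

def check_equality(output_path: Path, baseline_path: Path) -> bool:
    """Illustrative stand-in for a CheckEquality-style helper.

    Baseline tests of the command-line era worked roughly like this:
    capture the tool's textual output and compare it, line by line,
    against a file committed to the repository.
    """
    output_lines = output_path.read_text().splitlines()
    baseline_lines = baseline_path.read_text().splitlines()
    return output_lines == baseline_lines

# A run that reproduces its baseline passes; any textual drift fails,
# even a harmless formatting change -- one weakness of this style.
base = Path("run.baseline")
out = Path("run.out")
base.write_text("AUC: 0.873\nAccuracy: 0.912\n")
out.write_text("AUC: 0.873\nAccuracy: 0.912\n")
assert check_equality(out, base)
```

The key property (and limitation) this sketch shows is exact textual matching: any change to output formatting breaks the test, which is why pure output comparison fits a command-line tool better than an API.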
@TomFinley I agree. Instead of enabling the baseline tests, we should rewrite them using the API. That will not only provide the test coverage but also grow the set of samples for this toolkit. Rewriting the tests against the API may be more time-consuming, but it is surely the right thing to do.
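Rewriting a baseline test "using the API" means asserting on the metrics the API returns rather than diffing console text. A minimal sketch of that style (hypothetical names in Python for illustration; the real code would call ML.NET's evaluation APIs from C#):

```python
def train_and_evaluate(data):
    """Stand-in for training via the public API and evaluating a model.

    Here the "model" is just y ~ 2x, and we compute mean-squared error
    directly; the point is only the shape of an API-style test.
    """
    predictions = [x * 2.0 for x, _ in data]
    mse = sum((p - y) ** 2 for (_, y), p in zip(data, predictions)) / len(data)
    return {"mse": mse}

# An API-style test asserts on returned metrics within a tolerance,
# instead of comparing textual output against a baseline file.
metrics = train_and_evaluate([(1.0, 2.0), (2.0, 4.0), (3.0, 6.1)])
assert metrics["mse"] < 0.01
```

Because the assertion targets a numeric metric with a tolerance, the test survives cosmetic output changes while still catching real performance regressions, which is the property the baseline tests were meant to provide.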
Issue
Currently, many components in ML.NET do not have any kind of baseline tests. We run the risk of regressions, as we have no tests that can detect performance degradation for those components.
Required work
The following baseline tests have been identified as completely missing, yet they exist in prior internal versions. The task is to port them to ML.NET:
BaselineNormalize
Anomaly
Evaluators
FastTreeRanking
FastTreeRegression
FastTreeTweedieRegression
ImageTests
KM
LDSVM
LightGBMRank
ModelExport
MultiClassNaiveBayes
OGD
PoissonRegression
RegressionGamTrainer
ResultProcessor
SDCAMC
SDCAR