This is expected: you'll get different models from run to run.
You can get slightly more deterministic runs by using the AutoML API directly and setting the MLContext seed. This makes the train/validate dataset split deterministic, though non-deterministic elements remain, such as pulling random numbers in a multi-threaded trainer. That non-determinism gets amplified over time as the model sweeping proceeds, since a slightly different accuracy is fed into the SMAC sweeper (a Bayesian-style hyperparameter optimizer).
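For reference, here's a minimal sketch of that setup against the AutoML API. It follows the public Microsoft.ML / Microsoft.ML.AutoML surface, but the file name, column names, and `ModelInput` shape are placeholders you'd swap for your own:

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.AutoML;
using Microsoft.ML.Data;

public class ModelInput
{
    // Placeholder schema -- replace with your own columns.
    [LoadColumn(0)] public float Feature1 { get; set; }
    [LoadColumn(1)] public bool Label { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Fixing the seed makes MLContext-level randomness (e.g. the
        // train/validate split) deterministic. Multi-threaded trainers
        // can still introduce some run-to-run variation.
        var mlContext = new MLContext(seed: 0);

        IDataView data = mlContext.Data.LoadFromTextFile<ModelInput>(
            "data.csv", hasHeader: true, separatorChar: ',');

        var experiment = mlContext.Auto()
            .CreateBinaryClassificationExperiment(maxExperimentTimeInSeconds: 60);

        var result = experiment.Execute(data, labelColumnName: "Label");
        Console.WriteLine($"Best trainer: {result.BestRun.TrainerName}");
    }
}
```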
Side note for Model Builder devs: it seems the #Iteration column is not displaying the model's actual iteration number, but instead repeating the rank order shown on the left. See #Iteration on the right side of the output above.
It should be displayed as follows (to indicate the order in which each pipeline was tried):
| Rank | Trainer | Accuracy | AUC | AUPRC | F1-score | Duration | #Iteration |
|------|---------|----------|-----|-------|----------|----------|------------|
| 1 | SdcaLogisticRegressionBinary | 0.8842 | 0.9690 | 0.9714 | 0.8817 | 2.9 | 2 |
| 2 | SdcaLogisticRegressionBinary | 0.8817 | 0.9493 | 0.9627 | 0.8932 | 15.0 | 11 |
| 3 | SgdCalibratedBinary | 0.8810 | 0.9582 | 0.9724 | 0.8889 | 2.5 | 9 |
| 4 | SgdCalibratedBinary | 0.8810 | 0.9577 | 0.9720 | 0.8889 | 2.4 | 15 |
| 5 | AveragedPerceptronBinary | 0.8780 | 0.9540 | 0.9679 | 0.8819 | 3.2 | 1 |
/cc @LittleLittleCloud, @JakeRadMSFT
Thanks, Justin, for your quick answer. When using the AutoML API with the deterministic setting, would the accuracy also be better than via Model Builder?
@Balu2: When your data is IID, the accuracy should be the same. But if your dataset needs separate train/validate/test datasets, or specific grouping of rows, the API or CLI will give you a better model than Model Builder.
This is most often the case for data which is time-dependent, where you'd want newer examples in the test dataset than in the training dataset (as sketched below). Or when you'd like another split of the dataset to keep certain groups only in the test set (e.g. train on data from 100 grocery stores, and test on data from another 10 grocery stores, to see how well the model generalizes to data from new/unseen grocery stores).
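As a concrete example of the time-based case (just a sketch: the "Year" column is a hypothetical field in your data, and `mlContext`/`data` are as in the earlier snippet), you can split on a column filter instead of a random split:

```csharp
// Keep older rows for training and newer rows for testing.
// FilterRowsByColumn keeps rows whose numeric value falls in
// [lowerBound, upperBound); "Year" is a hypothetical float column.
IDataView trainData = mlContext.Data.FilterRowsByColumn(data, "Year", upperBound: 2019);
IDataView testData = mlContext.Data.FilterRowsByColumn(data, "Year", lowerBound: 2019);
```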
With the API or CLI, you can hand it separate datasets for train/valid/test. And with the API you can also hand it a samplingKeyColumn, which ensures that all rows containing the same value in the samplingKeyColumn are kept together in the same split of the dataset (either in train/validate splits or cross-validation folds); see the sketch after the link below.
More info on leakage: https://en.wikipedia.org/wiki/Leakage_(machine_learning)#Training_example_leakage
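For illustration, a sketch of both mechanisms (the "StoreId" grouping column is hypothetical, and `mlContext`, `data`, and `experiment` are as in the earlier snippets):

```csharp
// Grouped train/test split: all rows sharing a "StoreId" value end up
// on the same side of the split, preventing group leakage.
var split = mlContext.Data.TrainTestSplit(
    data, testFraction: 0.1, samplingKeyColumnName: "StoreId");

// AutoML can likewise keep groups together across its internal
// train/validate splits (or CV folds) via samplingKeyColumn.
var groupedResult = experiment.Execute(
    split.TrainSet, labelColumnName: "Label", samplingKeyColumn: "StoreId");
```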
@justinormont: the data is not IID. Via a pivot view, data from different rows is grouped together into one row. I also created 28 training models, because each of the 28 columns in the grouped row may or may not contain data (null). The complete dataset consists of about 130,000 of these grouped rows, of which 128,000 rows are used as the training set and 2,000 rows as the test set.
I will try the API to see what results it gives for the different models and let you know.
Hi. Since your original problem of getting different accuracies with Model Builder has already been addressed by @justinormont's responses, I will close this issue. Please feel free to reopen it if you're still having problems in that specific regard.
Issue: When I retrain with the ML.NET Model Builder using the same data and training parameters, there is a difference in Micro- and Macro-Accuracy.
**What did you do?** Used Model Builder to train on a dataset.
**What happened?** When I retrain on the same data (by pressing the Start training button again) with the same label, features, and training time, I see a difference in accuracy. Is there a reason why this is?
**What did you expect?** I expected more or less the same accuracy, because all data and parameters are identical.
Source code / logs
First training:
Top 5 models explored:

| Rank | Trainer | MicroAccuracy | MacroAccuracy | Duration | #Iteration |
|------|---------|---------------|---------------|----------|------------|
| 1 | FastTreeOva | 0.8575 | 0.7943 | 23.3 | 1 |
| 2 | LightGbmMulti | 0.8568 | 0.8002 | 3.8 | 2 |
| 3 | FastTreeOva | 0.8538 | 0.7843 | 27.0 | 3 |
| 4 | FastForestOva | 0.8513 | 0.7889 | 26.5 | 4 |
| 5 | LightGbmMulti | 0.8511 | 0.7808 | 6.4 | 5 |
Second training:

Top 5 models explored:

| Rank | Trainer | MicroAccuracy | MacroAccuracy | Duration | #Iteration |
|------|---------|---------------|---------------|----------|------------|
| 1 | FastTreeOva | 0.9016 | 0.7241 | 44.8 | 1 |
| 2 | FastForestOva | 0.8847 | 0.6465 | 42.9 | 2 |
| 3 | AveragedPerceptronOva | 0.8575 | 0.6175 | 13.3 | 3 |
| 4 | SymbolicSgdLogisticRegressionOva | 0.8387 | 0.4301 | 7.1 | 4 |
| 5 | SdcaMaximumEntropyMulti | 0.8331 | 0.3212 | 5.1 | 5 |