timhot closed 3 years ago
Auto-WEKA runs involve a certain degree of randomization, so there are a few things that could have happened. First, it's possible that the first run simply got lucky and found a good configuration early on. Second, the two runs might have been evaluated on different test data sets. Did you use different data sets in the two runs? The training times on the evaluation data set are very different, so I would think that the data was different.
In that case, the first, quick run simply gave you a misleading result because it wasn't evaluated on a large enough and/or representative enough data set.
This may be my lack of understanding...
I first ran a quick (15 min?) test on my data. Auto-WEKA tried 37 configurations and reported an accuracy of 93%.
I then did a 3-day run. 580 configurations were tried, and the reported accuracy was 67%.
This seems very odd to me - perhaps I am missing something crucial?
Full outputs below in case it helps you understand what I'd done.
Thanks for any suggestions that you can give.
Tim

PS: I am only really interested in the TP and FP rates for the morepork_more-pork class.
Quick test
Auto-WEKA result:
best classifier: weka.classifiers.trees.RandomForest
arguments: [-I, 10, -K, 0, -depth, 0]
attribute search: null
attribute search arguments: []
attribute evaluation: null
attribute evaluation arguments: []
metric: errorRate
estimated errorRate: 0.013625789298770355
training time on evaluation dataset: 0.186 seconds
You can use the chosen classifier in your own code as follows:
Classifier classifier = AbstractClassifier.forName("weka.classifiers.trees.RandomForest", new String[]{"-I", "10", "-K", "0", "-depth", "0"});
classifier.buildClassifier(instances);
Correctly Classified Instances        2813    93.4862 %
Incorrectly Classified Instances       196     6.5138 %
Kappa statistic                          0.9066
Mean absolute error                      0.0193
Root mean squared error                  0.0816
Relative absolute error                 27.3209 %
Root relative squared error             43.4984 %
Total Number of Instances             3009
=== Confusion Matrix ===
1464 7 0 0 3 0 0 12 0 0 0 0 0 1 1 0 0 0 0 0 | a = morepork_more-pork
25 479 1 0 2 0 0 30 0 0 1 0 1 0 0 0 1 1 0 0 | b = unknown
1 1 16 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 | c = siren
3 1 0 27 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 | d = dog
5 1 0 0 68 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 | e = duck
1 1 0 0 0 107 0 1 0 0 0 0 0 0 0 0 0 0 0 0 | f = dove
4 2 0 0 1 0 23 1 0 0 0 0 0 0 0 0 0 0 0 0 | g = human
34 15 0 0 1 1 0 205 0 0 1 0 0 0 1 0 0 0 0 0 | h = bird
0 2 0 0 0 0 0 1 22 0 0 0 0 0 0 0 0 0 0 0 | i = car
0 0 0 0 0 0 0 0 0 23 0 0 0 0 0 0 0 0 0 0 | j = rumble
2 6 0 0 0 2 0 4 0 0 249 0 0 0 0 0 0 0 0 0 | k = white_noise
0 0 0 0 0 0 0 2 0 0 0 8 0 0 0 0 0 0 0 0 | l = cow
1 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 | m = buzzy_insect
0 2 0 0 0 1 0 1 0 0 0 0 0 98 0 0 0 0 0 0 | n = plane
0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 0 0 0 0 | o = hammering
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 | p = frog
1 1 0 0 0 0 0 2 0 0 0 0 0 0 0 0 8 0 0 0 | q = morepork_more-pork_part
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 | r = chainsaw
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 | s = crackle
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 | t = car_horn
=== Detailed Accuracy By Class ===
               TP Rate  FP Rate  Precision  Recall  F-Measure  MCC    ROC Area  PRC Area
Weighted Avg.  0.935    0.030    0.936      0.935   0.934      0.912  0.995     0.977
Temporary run directories: /tmp/autoweka6555860488177895781/
For better performance, try giving Auto-WEKA more time. Tried 37 configurations; to get good results reliably you may need to allow for trying thousands of configurations.
3-day run
Auto-WEKA result:
best classifier: weka.classifiers.functions.SMO
arguments: [-C, 1.0322930159130057, -N, 0, -K, weka.classifiers.functions.supportVector.RBFKernel -G 0.4733376743447805]
attribute search: null
attribute search arguments: []
attribute evaluation: null
attribute evaluation arguments: []
metric: errorRate
estimated errorRate: 0.21635094715852443
training time on evaluation dataset: 2.835 seconds
You can use the chosen classifier in your own code as follows:
Classifier classifier = AbstractClassifier.forName("weka.classifiers.functions.SMO", new String[]{"-C", "1.0322930159130057", "-N", "0", "-K", "weka.classifiers.functions.supportVector.RBFKernel -G 0.4733376743447805"});
classifier.buildClassifier(instances);
Correctly Classified Instances        2024    67.2649 %
Incorrectly Classified Instances       985    32.7351 %
Kappa statistic                          0.4836
Mean absolute error                      0.0904
Root mean squared error                  0.2094
Relative absolute error                128.0776 %
Root relative squared error            111.589 %
Total Number of Instances             3009
=== Confusion Matrix ===
1445 42 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | a = morepork_more-pork
178 326 0 0 0 17 0 8 0 0 12 0 0 0 0 0 0 0 0 0 | b = unknown
10 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | c = siren
16 16 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | d = dog
51 17 0 0 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | e = duck
0 45 0 0 0 65 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | f = dove
21 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | g = human
124 97 0 0 0 2 0 32 0 0 3 0 0 0 0 0 0 0 0 0 | h = bird
2 20 0 0 0 2 0 0 0 0 1 0 0 0 0 0 0 0 0 0 | i = car
3 18 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 | j = rumble
17 92 0 0 0 6 0 0 0 0 148 0 0 0 0 0 0 0 0 0 | k = white_noise
2 6 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 | l = cow
1 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | m = buzzy_insect
13 71 0 0 0 3 0 1 0 0 13 0 0 1 0 0 0 0 0 0 | n = plane
2 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | o = hammering
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | p = frog
6 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | q = morepork_more-pork_part
2 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | r = chainsaw
0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | s = crackle
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 | t = car_horn
=== Detailed Accuracy By Class ===
               TP Rate  FP Rate  Precision  Recall  F-Measure  MCC    ROC Area  PRC Area
Weighted Avg.  0.673    0.182    ?          0.673   ?          ?      0.824     0.551
Temporary run directories: /tmp/autoweka1020608005622997004/
For better performance, try giving Auto-WEKA more time. Tried 580 configurations; to get good results reliably you may need to allow for trying thousands of configurations.
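
Since the PS asks only about the TP and FP rates for the morepork_more-pork class, both can be worked out from the confusion matrices in the two outputs. The following is a minimal sketch, assuming Weka's usual layout in which rows are actual classes and columns are predicted classes. The class `ClassRates` and its methods are hypothetical helpers written for illustration, not part of Weka or Auto-WEKA. To keep it short, each 20x20 matrix is collapsed to 2x2 (morepork_more-pork vs. everything else) by summing the counts shown in the outputs.

```java
public class ClassRates {

    /** TP rate (recall) for class k: correctly predicted k over all actual k (row sum). */
    public static double tpRate(int[][] matrix, int k) {
        int actual = 0;
        for (int count : matrix[k]) {
            actual += count;
        }
        return actual == 0 ? 0.0 : (double) matrix[k][k] / actual;
    }

    /** FP rate for class k: instances of other classes predicted as k, over all instances of other classes. */
    public static double fpRate(int[][] matrix, int k) {
        int falsePositives = 0;
        int negatives = 0;
        for (int row = 0; row < matrix.length; row++) {
            if (row == k) {
                continue; // skip the class itself; only other classes can produce false positives
            }
            falsePositives += matrix[row][k];
            for (int count : matrix[row]) {
                negatives += count;
            }
        }
        return negatives == 0 ? 0.0 : (double) falsePositives / negatives;
    }

    public static void main(String[] args) {
        // Collapsed counts (assumption: rows = actual, columns = predicted).
        // Row 0 = morepork_more-pork, row 1 = all other classes combined.
        int[][] quickRun = {{1464, 24}, {77, 1444}};
        int[][] threeDayRun = {{1445, 43}, {450, 1071}};

        System.out.printf("quick run: TP rate %.3f, FP rate %.3f%n",
                tpRate(quickRun, 0), fpRate(quickRun, 0));
        System.out.printf("3-day run: TP rate %.3f, FP rate %.3f%n",
                tpRate(threeDayRun, 0), fpRate(threeDayRun, 0));
    }
}
```

With these counts, the quick run works out to a TP rate of about 0.984 and an FP rate of about 0.051 for morepork_more-pork, while the 3-day run works out to about 0.971 and 0.296. The same two methods accept the full 20x20 matrix with `k = 0` if you prefer not to collapse it by hand.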