WeiFoo opened this issue 8 years ago
The AUCs report some large changes, particularly for random forests. Are they statistically significant? (Scott-Knott!)
I didn't see any large changes in AUC. You can check the raw results by the Scott-Knott test here: F-measure #13, Precision #14, AUC #15.
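For anyone unfamiliar with the ranking procedure being discussed, here is a simplified Scott-Knott-style sketch (an assumption on my part, not the exact implementation behind the linked results: a real Scott-Knott test adds a statistical significance check, e.g. an F-test or bootstrap, before accepting each split):

```python
# Simplified Scott-Knott-style ranking: sort treatments by median score,
# then recursively split at the cut that maximizes the between-group sum
# of squares. (Sketch only: the crude stopping rule below stands in for a
# proper significance test.)
from statistics import mean, median

def sk_rank(treatments):
    """treatments: dict name -> list of scores (e.g. AUCs per repeat).
    Returns dict name -> rank (0 = lowest-scoring group)."""
    items = sorted(treatments.items(), key=lambda kv: median(kv[1]))
    ranks = {}

    def split(group, rank):
        all_scores = [s for _, xs in group for s in xs]
        mu = mean(all_scores)
        best_gain, best_cut = 0.0, None
        for cut in range(1, len(group)):
            left = [s for _, xs in group[:cut] for s in xs]
            right = [s for _, xs in group[cut:] for s in xs]
            gain = (len(left) * (mean(left) - mu) ** 2 +
                    len(right) * (mean(right) - mu) ** 2)
            if gain > best_gain:
                best_gain, best_cut = gain, cut
        if best_cut is None or best_gain < 1e-6:  # crude stopping rule
            for name, _ in group:
                ranks[name] = rank
            return rank
        r = split(group[:best_cut], rank)
        return split(group[best_cut:], r + 1)

    split(items, 0)
    return ranks
```

Treatments whose score distributions are indistinguishable end up in the same rank, which is what "no large changes in AUC" would mean here.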
What do you mean by predicting only large delay? Is this a two-class problem?
Yes, only two classes here: large delay and non-delay. Originally there were high, medium, and low delay labels plus a non-delay label. I use high as the positive class and all the others as the negative class.
Is the large class rare? Does training need SMOTE? Can SMOTE be tuned as well as the learner?
The percentages of the large class are listed here:

- apache: 373/5725 = 6.5%
- duraspace: 99/2078 = 4.7%
- jboss: 597/8281 = 7.2%
- jira: 198/6193 = 3.2%
- moodle: 61/1313 = 4.6%
- spring: 91/5202 = 1.7%
- wso2: 118/4586 = 2.5%
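For reference, a quick script to recompute those ratios (dataset names copied from above; note that 99/2078 and 118/4586 actually round to 4.8% and 2.6%, so the duraspace and wso2 figures appear to be truncated rather than rounded):

```python
# Recompute the minority-class (large delay) percentage for each dataset.
counts = {
    "apache":    (373, 5725),
    "duraspace": (99,  2078),
    "jboss":     (597, 8281),
    "jira":      (198, 6193),
    "moodle":    (61,  1313),
    "spring":    (91,  5202),
    "wso2":      (118, 4586),
}
for name, (pos, total) in counts.items():
    print(f"{name}: {pos}/{total} = {100 * pos / total:.1f}%")
```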
Both duraspace and jira get very high (98+) scores when the tuning goals are the F-measure and precision, respectively, even though the percentages of the large class there are not very high. I will try SMOTE on those data sets and see whether there is any improvement. Personally, I wouldn't consider tuning SMOTE, based on other results, like the text-mining results.
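To make "tuning SMOTE" concrete, here is a minimal SMOTE-style oversampling sketch (an illustration only; the actual experiments likely used a library implementation such as imblearn's SMOTE). The tunable knobs are the neighborhood size `k` and how many synthetic samples to generate:

```python
# Minimal SMOTE-style oversampling: each synthetic minority sample is an
# interpolation between a random minority point and one of its k nearest
# minority neighbors. Names and parameters here are illustrative.
import math
import random

def smote(minority, n_new, k=5, seed=0):
    """minority: list of numeric feature vectors.
    Returns n_new synthetic vectors."""
    rng = random.Random(seed)

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    synthetic = []
    for _ in range(n_new):
        p = rng.choice(minority)
        # k nearest minority neighbors of p (excluding p itself)
        neighbors = sorted((q for q in minority if q is not p),
                           key=lambda q: dist(p, q))[:k]
        q = rng.choice(neighbors)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([x + gap * (y - x) for x, y in zip(p, q)])
    return synthetic
```

Tuning SMOTE would mean searching over `k` and the oversampling amount alongside the learner's own parameters, which roughly doubles the search space.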
Is this cross-validation, or is this past data predicting future data?
No, it's not cross-validation. The data from Wollongong was already split into training and testing sets; I sent an email to ask for details about how they were generated.
Update: according to Morakot, he can guarantee that for each class, the testing data was generated after the training data.
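That guarantee corresponds to a per-class, time-ordered split, which could be sketched like this (the field names `"timestamp"` and `"label"` and the 80/20 ratio are assumptions about the data layout, not confirmed details of the Wollongong split):

```python
# Time-ordered split: within each class, sort issues oldest-first and put
# the earliest fraction in training, so every test issue is newer than
# every training issue of the same class.
from collections import defaultdict

def time_split(issues, train_frac=0.8):
    """issues: list of dicts with 'timestamp' and 'label' keys.
    Returns (train, test) lists split per class by time."""
    by_class = defaultdict(list)
    for issue in issues:
        by_class[issue["label"]].append(issue)
    train, test = [], []
    for group in by_class.values():
        group.sort(key=lambda i: i["timestamp"])  # oldest first
        cut = int(len(group) * train_frac)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test
```

Unlike cross-validation, this never lets the learner peek at the future, which matters for delay prediction on evolving issue trackers.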
Settings
Results