bst-mug / n2c2

Support code for participation at the 2018 n2c2 Shared-Task Track 1
https://n2c2.dbmi.hms.harvard.edu
Apache License 2.0

[SVM] Explore boosting mechanisms #45

Open michelole opened 6 years ago

michelole commented 6 years ago

If any metric other than accuracy is used in the final evaluation (#29), we might suffer on the imbalanced classes.

If that holds true, we should consider methods to address the imbalance, such as boosting with AdaBoost.

https://link.springer.com/content/pdf/10.1007/s10115-009-0198-y.pdf

http://weka.sourceforge.net/doc.dev/weka/classifiers/meta/AdaBoostM1.html
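
For reference, a minimal Weka sketch of what wrapping a base learner in the AdaBoostM1 class linked above could look like. This is not the project's actual pipeline; the ARFF path and the SMO base classifier are placeholders for illustration.

import weka.classifiers.functions.SMO;
import weka.classifiers.meta.AdaBoostM1;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class AdaBoostSketch {
    public static void main(String[] args) throws Exception {
        // load training data from an ARFF file (path is a placeholder)
        Instances train = new DataSource("train.arff").getDataSet();
        train.setClassIndex(train.numAttributes() - 1);

        // boost a base learner with AdaBoost.M1
        AdaBoostM1 booster = new AdaBoostM1();
        booster.setClassifier(new SMO()); // Weka's SMO as a stand-in for the project's SVM
        booster.setNumIterations(10);     // number of boosting rounds
        booster.buildClassifier(train);
    }
}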

michelole commented 6 years ago

@pievos101 said:

If there are unbalanced classes, and if it is not too much work, personally I would try random forest as well. Isn't the whole random forest thing the entire boosting idea?
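
Strictly speaking, random forest is a bagging ensemble rather than boosting, but it is cheap to compare against AdaBoost here. A minimal Weka sketch of evaluating an out-of-the-box RandomForest with 10-fold cross-validation and printing per-class F-measures; the ARFF path is again a placeholder, not the project's actual data loading.

import weka.classifiers.Evaluation;
import weka.classifiers.trees.RandomForest;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;

public class RandomForestSketch {
    public static void main(String[] args) throws Exception {
        // load training data from an ARFF file (path is a placeholder)
        Instances train = new DataSource("train.arff").getDataSet();
        train.setClassIndex(train.numAttributes() - 1);

        // 10-fold cross-validation of an out-of-the-box random forest
        RandomForest forest = new RandomForest();
        Evaluation eval = new Evaluation(train);
        eval.crossValidateModel(forest, train, 10, new java.util.Random(42));

        // per-class precision/recall/F-measure, to spot the imbalanced classes
        System.out.println(eval.toClassDetailsString());
    }
}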

michelole commented 6 years ago

Consider also

// class weights, in the order MET NOT_MET
svm.setWeights("1 1");
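
If the weighting route is taken, a common heuristic is to make the per-class weights inversely proportional to the class frequencies in the training data. The sketch below assumes that setWeights accepts whitespace-separated weights in the MET NOT_MET order of the comment above; the counts are placeholders to be read from the actual training set.

// placeholder counts, to be filled in from the actual training data
int metCount = 1;     // number of MET training examples (placeholder)
int notMetCount = 1;  // number of NOT_MET training examples (placeholder)
double total = metCount + notMetCount;
// give the rarer class the larger weight so its errors cost more
String weights = String.format(java.util.Locale.ROOT, "%.3f %.3f",
        total / (2.0 * metCount), total / (2.0 * notMetCount));
svm.setWeights(weights);  // assumes the wrapper takes weights in MET NOT_MET order, as above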
michelole commented 6 years ago

Test the impact on the overall micro F1 score first to see whether it's worth it. Maybe use a fake classifier that cheats by looking at the training data and is always right on the imbalanced classes, as an upper bound on the possible gain?