Closed: brankaj closed this 7 years ago
Several methods support multi-task classification, including random forest. A multi-task version of Lasso also exists. Both methods are implemented in this notebook and evaluated using accuracy and precision scores.
Cool!
Unfortunately, I am not sure whether this notebook has enough findings to be a pull request.
Always enough to be a pull request! I'll take a look.
Nice to see the `MultiTaskLasso` classifier in use.
All methods seem to fail to recognize true positives.
Your intuition is spot on. It looks like every sample is being classified as a negative. Hence, accuracy is just the proportion of samples that are negative. I'm guessing your balanced accuracy would be 0.5 and kappa would be 0 (i.e. no predictive ability). I think part of the problem could be that you're not trying different `alpha` (regularization strength) values. Usually, we use `GridSearchCV` to try a range of `alpha` values. While `GridSearchCV` will make things slower, I think optimizing `alpha` is the place to start.
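Here is a rough sketch of both points, using made-up synthetic data rather than the notebook's data. The first part shows why an all-negative predictor can look good on accuracy while balanced accuracy and kappa show no predictive ability. The second part tries a small grid of `alpha` values with `GridSearchCV`. The 0.5 threshold on the `MultiTaskLasso` output, the `mean_balanced_accuracy` helper, and the `alpha` grid are all assumptions for illustration, not what the notebook does.

```python
import numpy as np
from sklearn.linear_model import MultiTaskLasso
from sklearn.metrics import balanced_accuracy_score, cohen_kappa_score, make_scorer
from sklearn.model_selection import GridSearchCV

# Part 1: a classifier that predicts "negative" for every sample.
y_true = np.array([0] * 90 + [1] * 10)   # toy labels: 90% negative
y_all_negative = np.zeros_like(y_true)

baseline_accuracy = float((y_true == y_all_negative).mean())          # 0.9
baseline_balanced = balanced_accuracy_score(y_true, y_all_negative)   # 0.5
baseline_kappa = cohen_kappa_score(y_true, y_all_negative)            # 0.0

# Part 2: grid search over alpha on synthetic multi-task data.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))
W = np.zeros((20, 3))
W[:5] = rng.normal(size=(5, 3))          # only the first 5 features carry signal
Y = ((X @ W + 0.5 * rng.normal(size=(300, 3))) > 1.0).astype(float)


def mean_balanced_accuracy(Y_true, Y_score):
    """Hypothetical helper: threshold the regression output at 0.5,
    then average balanced accuracy over the tasks (columns)."""
    Y_true = np.asarray(Y_true).astype(int)
    Y_pred = (np.asarray(Y_score) >= 0.5).astype(int)
    per_task = [
        balanced_accuracy_score(Y_true[:, t], Y_pred[:, t])
        for t in range(Y_true.shape[1])
    ]
    return float(np.mean(per_task))


alphas = [0.001, 0.01, 0.1, 1.0]         # illustrative grid, not a recommendation
search = GridSearchCV(
    MultiTaskLasso(max_iter=5000),
    param_grid={"alpha": alphas},
    scoring=make_scorer(mean_balanced_accuracy),
    cv=3,
)
search.fit(X, Y)

for alpha, score in zip(alphas, search.cv_results_["mean_test_score"]):
    print(f"alpha={alpha}: mean balanced accuracy = {score:.3f}")
print("best alpha:", search.best_params_["alpha"])
```

If the largest `alpha` shrinks every coefficient to zero, the model predicts the same value for all samples and its balanced accuracy falls back to 0.5, which is the failure mode described above.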
This notebook computes accuracy scores for genes that have targeted therapies. Since some agents target several genes, it is interesting to see whether building an ensemble model can improve results over a model that is built separately for each gene. Several methods support multi-task classification, including random forest. A multi-task version of Lasso also exists. Both methods are implemented in this notebook and evaluated using accuracy and precision scores.
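The comparison described above can be sketched roughly as follows. This is only an illustration on synthetic data: the three label columns stand in for hypothetical genes, and the forest settings are assumptions rather than the notebook's actual configuration. It relies on `RandomForestClassifier` accepting a 2-D label matrix, which is how scikit-learn handles multi-output classification.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 20))

# Three hypothetical gene labels that share part of their signal.
shared = X[:, :3].sum(axis=1)
Y = np.column_stack(
    [(shared + X[:, 3 + t] > 1.0).astype(int) for t in range(3)]
)

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=0)

# One model fitted on all label columns at once.
joint = RandomForestClassifier(n_estimators=100, random_state=0)
Y_joint = joint.fit(X_train, Y_train).predict(X_test).astype(int)

# One model per label column, fitted independently.
Y_separate = np.column_stack([
    RandomForestClassifier(n_estimators=100, random_state=0)
    .fit(X_train, Y_train[:, t])
    .predict(X_test)
    for t in range(Y.shape[1])
]).astype(int)

for t in range(Y.shape[1]):
    for name, Y_pred in (("joint", Y_joint), ("separate", Y_separate)):
        acc = accuracy_score(Y_test[:, t], Y_pred[:, t])
        # zero_division=0 avoids a warning when no positives are predicted.
        prec = precision_score(Y_test[:, t], Y_pred[:, t], zero_division=0)
        print(f"task {t} ({name}): accuracy={acc:.3f}, precision={prec:.3f}")
```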
Unfortunately, I am not sure whether this notebook has enough findings to be a pull request. The reasons are as follows: