Closed: michaelmhoffman closed this issue 6 years ago
Use histone modifications and/or DNA sequence elements to predict enhancers in ENCODE, FANTOM5, and VISTA data
I am struggling to grasp what the neural network is actually doing in this paper. Confidence scores from each ensemble model are the raw input to the neural network. One could imagine the decision function learned from the confidence-score input would have some neurons that suppress poorly performing classifiers and others that amplify well-performing ones. Is there anything else it could capture?
The authors compare this approach to a MUCH simpler majority-voting technique in supplementary figure 1 but do not go beyond this discussion. While an interesting idea, I can imagine that this would be a pain to implement at test time. Because the network topology expects a particular confidence score to arrive from a particular ensemble member, careful consideration is needed to ensure the data splits for each ensemble model are exactly the same.
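To make the test-time coupling concrete, here is a minimal sketch of this kind of stacking setup. This is my reading of the design, not the authors' code: each SVM is trained on its own subsample, and the fixed-order vector of per-SVM confidence scores is the only input the combiner network sees. The dataset, member count, and hyperparameters are all illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.neural_network import MLPClassifier

rng = np.random.RandomState(0)
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train each ensemble member on a random 20% subsample of the training data
# (illustrative stand-in for training many members on small splits).
n_members = 5
svms = []
for _ in range(n_members):
    idx = rng.choice(len(X_train), size=len(X_train) // 5, replace=False)
    svms.append(LinearSVC(dual=False).fit(X_train[idx], y_train[idx]))

def confidence_matrix(X):
    # One column of decision-function scores per SVM. The fixed column order
    # is exactly why the splits must be reproducible at test time: swap two
    # members and the combiner's inputs are silently scrambled.
    return np.column_stack([svm.decision_function(X) for svm in svms])

# The combiner network sees only confidence scores, never the raw features.
combiner = MLPClassifier(hidden_layer_sizes=(8,), max_iter=1000,
                         random_state=0).fit(confidence_matrix(X_train), y_train)
acc = combiner.score(confidence_matrix(X_test), y_test)
```

The point of the sketch is the `confidence_matrix` function: any consumer of the trained combiner must reconstruct that exact member ordering.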
I am not sure if we can classify this paper as deep learning.
Their treatment of cross-validation is interesting, but the rationale is not described well enough. For each SVM they train on 20% of the data and evaluate performance on the remaining 80%. To me, this sounds like an ensemble of weak learners, which could perform well while greatly reducing training time. Instead of feeding confidence scores into the NN, could they use weighted majority voting (weighted by evaluation performance)?
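The weighted-majority-voting alternative suggested above can be sketched in a few lines. The vote matrix and per-member held-out accuracies here are made-up stand-ins, not values from the paper:

```python
import numpy as np

# Rows: ensemble members; columns: test examples (hypothetical binary votes).
predictions = np.array([
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
])
# Hypothetical accuracy of each member on its 80% evaluation split.
heldout_acc = np.array([0.9, 0.6, 0.55])

def weighted_majority_vote(predictions, weights):
    # Sum each member's weight onto the class it voted for, per example,
    # and return the class with the larger total weight.
    weights = np.asarray(weights, dtype=float)
    score_for_1 = weights @ predictions        # total weight voting "1"
    score_for_0 = weights.sum() - score_for_1  # remaining weight votes "0"
    return (score_for_1 > score_for_0).astype(int)

votes = weighted_majority_vote(predictions, heldout_acc)  # → [1, 0, 1, 1]
```

Unlike the stacked network, this combiner has no learned parameters beyond the weights, so it has no dependence on reproducing exact data splits at test time.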
http://doi.org/10.1093/nar/gku1058