greenelab / deep-review

A collaboratively written review paper on deep learning, genomics, and precision medicine
https://greenelab.github.io/deep-review/

deepTarget: End-to-end Learning Framework for microRNA Target Prediction using Deep Recurrent Neural Networks #30

Closed cgreene closed 6 years ago

cgreene commented 8 years ago

http://arxiv.org/abs/1603.09123

agitter commented 7 years ago

@kumardeep27 Thanks. If you quickly scan it and find that it is not relevant for the review, we can also close the issue without a full summary.

kumardeep27 commented 7 years ago

deepTarget uses recurrent neural network (RNN)-based autoencoders to predict miRNA targets from miRNA:mRNA sequence interactions (4,735 positive and 1,225 negative pairs). The RNN is used for both feature selection and classification. Figure 1 reports an F1-measure of 0.91 in comparison with 6 existing tools and shows that deepTarget gives a 25% improvement in performance. The method treats sequence modeling as the task of "mapping input sequence to an output sequence that may be of unequal length", which is handled by the autoencoders, and target prediction as the task of "mapping an input sequence to a fixed-sized" output.

The input layer is connected to the first layer of two parallel autoencoders that model the miRNA and mRNA sequences, respectively. The second layer is an RNN layer that models the interaction between the miRNA and mRNA sequences. The outputs of the top RNN layer are fed into a fully connected output layer, which contains two units for classifying targets and non-targets. Training involves two main steps:
a) unsupervised learning with the 2 autoencoders
b) supervised learning of the full architecture

Dataset: The data were obtained from miRecords and mirBase, covering 2,042 human miRs with information on binding to cognate mRNAs. Both site-level (507) and gene-level (2,891) datasets were used as the positive training set. Because experimentally verified non-pairs are lacking, the negative dataset was created in line with previous procedures in the literature. Briefly, 507 site-level and 2,122 gene-level miRNA-mRNA pairs were generated using mock miRNAs derived from mirBase and cognate targets from MiRanda.

Features: RNA sequences are represented as dense vectors, unlike the sparse encoding used by existing methods. The algorithm performs unsupervised feature learning with an RNN-based encoder-decoder, where each model has 2 RNNs. These features are fed to stacked RNN layers. The RNN-based approach bypasses the sequence alignment step otherwise used to detect the binding sites. In the interaction modeling step, the learned features of both partners are combined (the paper considers several ways of representing the interaction between the two models), with the two unsupervised encoders followed by a stacked RNN layer.

Architecture: The main architecture is [(4-30-4) || (4-30-4)] followed by (60-30)-2, where (4-30-4) are parallel autoencoders with 3 layers, 4 input units, and 30 units in the first hidden layer, and (60-30)-2 is the stacked RNN. A logarithmic loss function was optimized with Adam, weights were initialized from a uniform distribution, and dropout was used as a regularizer.

Performance: The evaluation metrics include accuracy, sensitivity, specificity, F-measure, and positive predictive value (PPV, which is more useful for imbalanced datasets). Values averaged over 10-fold CV were reported for these metrics, and deepTarget outperformed the existing tools (Table 1). The 2-layer architecture performed better than the 1-layer and 3-layer ones. A dropout probability of 0.1 during training gave the best performance metrics. The GRU memory unit architecture also outperformed the LSTM memory unit architecture. Section 4.4 of the paper describes the visualization of RNN activations of the hidden layer alongside the sequence alignment of miRNA and mRNA.

Available at http://data.snu.ac.kr/pub/deepTarget
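
For readers less familiar with the metrics listed in the summary above (accuracy, sensitivity, specificity, F-measure, PPV), here is a minimal sketch of how they follow from confusion-matrix counts. It is not taken from the deepTarget code: the function names and the toy labels are hypothetical, and label 1 is assumed to mean "target" and 0 "non-target".

```python
def confusion_counts(y_true, y_pred):
    """Count TP, FP, TN, FN for binary labels (1 = target, 0 = non-target)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Compute the five metrics named in the summary from the counts."""

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)   # recall on true targets
    specificity = ratio(tn, tn + fp)   # recall on true non-targets
    ppv = ratio(tp, tp + fp)           # positive predictive value (precision)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    f_measure = ratio(2 * ppv * sensitivity, ppv + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "f_measure": f_measure,
    }


# Toy labels, invented purely for illustration (not data from the paper).
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0]
metrics = classification_metrics(*confusion_counts(y_true, y_pred))
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a 10-fold CV setting like the one described, a function of this kind would be applied to each held-out fold and the resulting values averaged. Reporting specificity and PPV alongside accuracy matters when the classes are imbalanced, since a classifier that labels every pair as a target can still score a high accuracy.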

hjiangcsu commented 6 years ago

When you used miRanda to select the mock miRNA binding sites, how did you set the miRanda parameters?

agitter commented 6 years ago

@hjiangcsu This issue was a discussion of the paper by third parties and did not involve the original authors. I suggest you try contacting the authors listed at http://data.snu.ac.kr/pub/deepTarget