greenelab / deep-review

A collaboratively written review paper on deep learning, genomics, and precision medicine
https://greenelab.github.io/deep-review/

MiRTDL: a deep learning approach for miRNA target prediction #49

Open agitter opened 8 years ago

agitter commented 8 years ago

http://doi.org/10.1109/TCBB.2015.2510002

gwaybio commented 8 years ago

Biology

The authors are interested in predicting whether an miRNA binds and regulates a gene. They generate 20 features based on complementary sequences, binding affinity/accessibility, and conservation scores for miRNA-mRNA pairs found in the TargetScanS and TarBase datasets.

Computational Aspects

They implement a CNN with two convolutional layers, mean pooling, and a kernel size of 3. They use constraint relaxation to overcome class imbalance (in this case, there are more experimentally validated positives than negatives). Their method defines four distinct datasets, one of which is negative, based on the evidence for each pair and the confidence in the miRNA-mRNA regulation. The CNN then takes the miRNA-mRNA features as input, with the goal of classifying each input into one of the four datasets. They use an experimentally validated test set to validate performance.
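To make the layer arithmetic concrete, here is a minimal pure-Python sketch of two rounds of kernel-size-3 convolution followed by mean pooling. It is not the authors' implementation: the 2x2 pooling window, the unpadded ("valid") convolution, the 14x14 toy input, and the fixed averaging kernel are all assumptions for illustration, and the final four-class classifier is omitted.

```python
def conv2d_valid(image, kernel):
    """Unpadded 2D cross-correlation of a square list-of-lists image."""
    k = len(kernel)
    out_size = len(image) - k + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(k) for j in range(k))
            for c in range(out_size)
        ]
        for r in range(out_size)
    ]


def mean_pool(image, size=2):
    """Non-overlapping mean pooling; rows/columns that do not fit are dropped."""
    out_size = len(image) // size
    return [
        [
            sum(image[r * size + i][c * size + j]
                for i in range(size) for j in range(size)) / (size * size)
            for c in range(out_size)
        ]
        for r in range(out_size)
    ]


# Toy 14x14 grid standing in for one miRNA-mRNA feature input (arbitrary values).
grid = [[float(r * 14 + c) for c in range(14)] for r in range(14)]
# Hypothetical 3x3 averaging kernel; a real CNN would learn these weights.
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]

layer1 = mean_pool(conv2d_valid(grid, kernel))    # side length 14 -> 12 -> 6
layer2 = mean_pool(conv2d_valid(layer1, kernel))  # side length 6 -> 4 -> 2
shapes = (len(layer1), len(layer1[0]), len(layer2), len(layer2[0]))
```

Under these assumptions the spatial size shrinks quickly, so very little of the grid is left after the second layer.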

Summary

agitter commented 8 years ago

It's hard to understand their input data from Section 2.4. As @gwaygenomics said, they try to resample the features to get 64, 196, 484, or 900 features. Figure 2 and the text suggest that they treat the 196 features as a 2D input (14x14), but when describing Figure 3d they say the features are a 1D array. This potentially makes the application of a CNN to unstructured data much worse than #79. In #79 I ultimately think the CNN makes sense given how they constrained the CNN architecture.
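A small sketch of why the 1D-versus-2D question matters. It assumes a plain row-by-row reshape, which the paper may or may not use; `to_grid` and `grid_neighbors` are hypothetical helper names. All four feature counts are perfect squares, and folding a flat vector into a square makes features that were 14 positions apart become vertical neighbors that a 2D kernel treats as adjacent.

```python
import math


def to_grid(features):
    """Reshape a flat feature list into a square grid, row by row."""
    side = math.isqrt(len(features))
    if side * side != len(features):
        raise ValueError("feature count must be a perfect square")
    return [features[r * side:(r + 1) * side] for r in range(side)]


def grid_neighbors(index, side):
    """Flat indices of the features directly above, below, left, and right."""
    r, c = divmod(index, side)
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return sorted(
        rr * side + cc
        for rr, cc in candidates
        if 0 <= rr < side and 0 <= cc < side
    )


# Side length of the square implied by each feature count mentioned above.
sides = {n: math.isqrt(n) for n in (64, 196, 484, 900)}

# Use the flat index itself as the feature value so positions are easy to read.
grid = to_grid(list(range(196)))
neighbors_of_20 = grid_neighbors(20, 14)
```

Here feature 20 ends up next to features 19 and 21, as in the 1D array, but also next to features 6 and 34, purely as a result of the row width. Unless the feature order is chosen so that this adjacency is meaningful, a 2D convolution is pooling over arbitrary pairs.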