greenelab / deep-review

A collaboratively written review paper on deep learning, genomics, and precision medicine
https://greenelab.github.io/deep-review/

TIDE: predicting translation initiation sites by deep learning #214

Open agitter opened 7 years ago

agitter commented 7 years ago

https://doi.org/10.1101/103374

Translation initiation is a key step in the regulation of gene expression. In addition to the annotated translation initiation sites (TISs), the translation process may also start at multiple alternative TISs (including both AUG and non-AUG codons), which makes it challenging to predict and study the underlying regulatory mechanisms. Meanwhile, the advent of several high-throughput sequencing techniques for profiling initiating ribosomes at single-nucleotide resolution, e.g., GTI-seq and QTI-seq, provides abundant data for systematically studying the general principles of translation initiation and the development of computational method for TIS identification. We have developed a deep learning based framework, named TIDE, for accurately predicting TISs on a genome-wide scale based on QTI-seq data. TIDE extracts the sequence features of translation initiation from the surrounding sequence contexts of TISs using a hybrid neural network and further integrates the prior preference of TIS codon composition into a unified prediction framework. Extensive tests demonstrated that TIDE can greatly outperform the state-of-the-art prediction methods in identifying TISs. In addition, TIDE was able to identify important sequence signatures for individual types of TIS codons, including a Kozak-sequence-like motif for AUG start codon. Furthermore, the TIDE prediction score can be related to the strength of translation initiation in various biological scenarios, including the repressive effect of the upstream open reading frames (uORFs) on gene expression and the mutational effects influencing translation initiation efficiency.

gwaybio commented 7 years ago

Very nice paper - clean and convincing. The authors successfully predict translation initiation sites (TISs) from canonical (AUG) and non-canonical (CUG, GUG, UUG, etc.) start codons plus +/- 100 bp of surrounding sequence context. They compare TIDE to alternative TIS prediction algorithms and show that their method outperforms the others by a wide margin. Additionally, their code is public (https://github.com/zhangsaithu/tide_demo), which is awesome! Thanks @zhangsaithu - we'd love to get your feedback on this paper as well.

Why we should include it in the review

I am thinking that I will use this paper in the "study gene expression" section. There appear to be (more recent) efforts to apply deep learning to study gene regulation, which I think fits nicely in that section. (Also see #74.)

Biological Aspects

Predicts TISs in HEK cells using a QTI-seq dataset. Briefly, QTI-seq profiles translation-initiating ribosomes genome-wide. The authors identify interesting alternative start site preferences and observe how local sequence context can impact translation efficiency. They support their results by identifying Kozak sequences as important indicators of translation efficiency.

Computational Aspects

One-hot encoded {A,C,U,G} nucleotide sequences of length 203 (+/- 100 nt of context plus the 3-nt candidate start codon) are fed into a CNN with max pooling and dropout, followed by an LSTM RNN with a logistic output layer that gives the probability of the sequence being a TIS. They deal with unbalanced classes using a bootstrap sampling method and report performance using ROC and PR curves.
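
To make the input format concrete, here is a minimal pure-Python sketch of the windowing, one-hot encoding, and class balancing described above. It is not taken from the TIDE code: the function names, the A/C/G/U channel order, the all-zero encoding for unknown bases, and the particular bootstrap variant (keep all positives, resample an equal number of negatives with replacement) are all assumptions for illustration. The paper's exact sampling scheme may differ.

```python
import random

NUCLEOTIDES = "ACGU"      # channel order is an assumption, not from the paper
FLANK = 100               # nucleotides of context on each side of the codon
WINDOW = 2 * FLANK + 3    # 100 + 3 + 100 = 203


def extract_window(transcript, codon_start):
    """Return the 203-nt window around the codon starting at codon_start.

    Returns None when the window would run off either end of the transcript.
    """
    start = codon_start - FLANK
    end = codon_start + 3 + FLANK
    if start < 0 or end > len(transcript):
        return None
    return transcript[start:end]


def one_hot(seq):
    """Encode an RNA string as a list of 4-element rows.

    DNA-style T is treated as U; any other unknown base (e.g. N) becomes
    an all-zero row (an assumption made here for simplicity).
    """
    seq = seq.upper().replace("T", "U")
    return [[1.0 if base == nt else 0.0 for nt in NUCLEOTIDES] for base in seq]


def balanced_bootstrap(positives, negatives, rng):
    """Build one balanced training set (hypothetical helper).

    Keeps every positive example and draws the same number of negatives
    with replacement, then shuffles the labelled pairs.
    """
    sampled = [rng.choice(negatives) for _ in positives]
    examples = [(x, 1) for x in positives] + [(x, 0) for x in sampled]
    rng.shuffle(examples)
    return examples


# Toy example: an AUG codon flanked by 100 nt on each side.
transcript = "C" * FLANK + "AUG" + "G" * FLANK
window = extract_window(transcript, FLANK)
encoded = one_hot(window)
print(len(encoded), len(encoded[0]))  # 203 4

# One balanced draw from a toy set with 2 positives and 5 negatives.
batch = balanced_bootstrap(["p1", "p2"], ["n1", "n2", "n3", "n4", "n5"],
                           random.Random(0))
```

Each encoded window is a 203 x 4 matrix, which is the shape a 1D convolutional layer would consume. Repeating the balanced draw with different seeds gives a set of balanced training sets, which is one common way to handle the far larger number of non-TIS sites.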

Training and architecture implemented in Keras.

hassanzadeh commented 7 years ago

Here is another paper with a similar architecture: http://ieeexplore.ieee.org/document/7822515/

agitter commented 7 years ago

The similar paper mentioned above, DeeperBind, is #149

hassanzadeh commented 7 years ago

I see, thanks.
