Open agitter opened 6 years ago
Thank you very much for finding the problem! Please refer to: https://github.com/lykaust15/DeepSimulator
I will update the link on bioRxiv as well.
@agitter I can work on adding this to the paper.
@agitter @souravsingh There are two other research papers from our group that use deep learning methods to solve bioinformatics problems. In case you are interested, I have put the links and short introductions here.
DEEPre: sequence-based enzyme EC number prediction by deep learning: https://www.ncbi.nlm.nih.gov/pubmed/29069344
MOTIVATION: Annotation of enzyme function has a broad range of applications, such as metagenomics, industrial biotechnology, and diagnosis of enzyme deficiency-caused diseases. However, the time and resource required make it prohibitively expensive to experimentally determine the function of every enzyme. Therefore, computational enzyme function prediction has become increasingly important. In this paper, we develop such an approach, determining the enzyme function by predicting the Enzyme Commission number. RESULTS: We propose an end-to-end feature selection and classification model training approach, as well as an automatic and robust feature dimensionality uniformization method, DEEPre, in the field of enzyme function prediction. Instead of extracting manually crafted features from enzyme sequences, our model takes the raw sequence encoding as inputs, extracting convolutional and sequential features from the raw encoding based on the classification result to directly improve the prediction performance. The thorough cross-fold validation experiments conducted on two large-scale datasets show that DEEPre improves the prediction performance over the previous state-of-the-art methods. In addition, our server outperforms five other servers in determining the main class of enzymes on a separate low-homology dataset. Two case studies demonstrate DEEPre's ability to capture the functional difference of enzyme isoforms. AVAILABILITY: The server could be accessed freely at http://www.cbrc.kaust.edu.sa/DEEPre.
Sequence2Vec: a novel embedding approach for modeling transcription factor binding affinity landscape: https://www.ncbi.nlm.nih.gov/pubmed/28961686
MOTIVATION: An accurate characterization of transcription factor (TF)-DNA affinity landscape is crucial to a quantitative understanding of the molecular mechanisms underpinning endogenous gene regulation. While recent advances in biotechnology have brought the opportunity for building binding affinity prediction methods, the accurate characterization of TF-DNA binding affinity landscape still remains a challenging problem. RESULTS: Here we propose a novel sequence embedding approach for modeling the transcription factor binding affinity landscape. Our method represents DNA binding sequences as a hidden Markov model which captures both position specific information and long-range dependency in the sequence. A cornerstone of our method is a novel message passing-like embedding algorithm, called Sequence2Vec, which maps these hidden Markov models into a common nonlinear feature space and uses these embedded features to build a predictive model. Our method is a novel combination of the strength of probabilistic graphical models, feature space embedding and deep learning. We conducted comprehensive experiments on over 90 large-scale TF-DNA datasets which were measured by different high-throughput experimental technologies. Sequence2Vec outperforms alternative machine learning methods as well as the state-of-the-art binding affinity prediction methods. AVAILABILITY AND IMPLEMENTATION: Our program is freely available at https://github.com/ramzan1990/sequence2vec.
@souravsingh we are focusing on addressing the reviews this week so we can finalize a new version of the paper. We will not have time to review and merge new pull requests that aren't directly related to those reviews. There are still several areas in #678 where we need help if you'd like to join in. I'm creating these new issues more for future discussion than for the next version of the manuscript.
Thanks @lykaust15, those also look interesting. We've been creating issues for each individual paper. If you'd like to discuss those, can you please create a new issue for each with the paper title as the issue title?
https://doi.org/10.1101/238683
@lykaust15 is the repository still private? I wasn't able to access it at the link above.