The code for the paper "Synthetic Promoter Design in Escherichia coli based on Deep Generative Network"
MIT License

Synthetic Promoter Design in Escherichia coli based on Deep Generative Network

Code for the computational models in "Synthetic Promoter Design in Escherichia coli based on Deep Generative Network".

We report a novel AI-based framework for de novo promoter design in E. coli that designs brand-new synthetic promoters in silico. The framework combines a deep generative model, which guides the search, with a prediction model that pre-selects the most promising promoters.

In the experiments, up to 70.8% of the AI-designed promoters were demonstrated to be functional, and they shared no significant sequence similarity with the E. coli genome. This repository provides the code used to generate the promoter sequences, which can then be tested experimentally.
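
The generate-then-pre-select idea described above can be sketched in a few lines. This is a minimal illustration and not the repository's code: `toy_generator` and `toy_predictor` are hypothetical stand-ins for the trained generative and prediction models, the AT-fraction score is an arbitrary placeholder, and the sequence length of 50 and the candidate counts are chosen only for the example.

```python
import random

NUCLEOTIDES = "ACGT"


def toy_generator(n, length, rng):
    """Hypothetical stand-in for the generative model: uniform random sequences."""
    return [
        "".join(rng.choice(NUCLEOTIDES) for _ in range(length))
        for _ in range(n)
    ]


def toy_predictor(seq):
    """Hypothetical stand-in for the prediction model: AT fraction as a placeholder score."""
    return (seq.count("A") + seq.count("T")) / len(seq)


def preselect(candidates, score_fn, top_k):
    """Rank candidates by predicted score (highest first) and keep the best top_k."""
    return sorted(candidates, key=score_fn, reverse=True)[:top_k]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
candidates = toy_generator(n=1000, length=50, rng=rng)
selected = preselect(candidates, toy_predictor, top_k=10)

for seq in selected:
    print(f"{toy_predictor(seq):.2f}  {seq}")
```

In a real run, the two stand-ins would be replaced by the trained models, and only the pre-selected sequences would go on to experimental testing.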

Prerequisites

Installation

The computational models can be downloaded directly with:

git clone https://github.com/HaochenW/Deep_promoter.git

Installation has been tested successfully on a Linux platform.

Using the GAN model to generate promoter sequence data
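
Generative models for DNA commonly output, for each position, one score per nucleotide, and a sequence is read off by taking the highest-scoring nucleotide at each position. The sketch below illustrates that decoding step only. It assumes a column order of A, C, G, T and a plain list-of-lists output; both are assumptions for illustration and may differ from what the repository's generator produces.

```python
ALPHABET = "ACGT"  # assumed column order, not taken from the repository


def decode(score_matrix):
    """Turn per-position nucleotide scores into a sequence by taking the argmax of each row."""
    sequence = []
    for row in score_matrix:
        if len(row) != len(ALPHABET):
            raise ValueError("each position needs one score per nucleotide")
        best = max(range(len(row)), key=row.__getitem__)
        sequence.append(ALPHABET[best])
    return "".join(sequence)


# Hypothetical generator output for a 3-nucleotide sequence.
example = [
    [0.70, 0.10, 0.10, 0.10],
    [0.05, 0.05, 0.10, 0.80],
    [0.20, 0.20, 0.50, 0.10],
]
decoded = decode(example)
print(decoded)  # ATG
```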

Using the convolutional neural network (CNN) / support vector regression (SVR) model to pre-select high-expression promoter sequences

The CNN model was trained on the dataset from Thomason et al. [2], which contains 14098 promoters with corresponding gene expression levels measured by dRNA-seq. The SVR model was trained on the first-round experimental results.
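
Before a sequence can be passed to a model such as a CNN, it has to be turned into numbers. One-hot encoding is a common choice and is assumed here for illustration; the repository may encode its inputs differently. In this minimal sketch the function name `one_hot` and the A, C, G, T column order are hypothetical, and the example sequence is arbitrary.

```python
ALPHABET = "ACGT"  # assumed column order, not taken from the repository


def one_hot(seq):
    """Encode a DNA string as a list of rows, one row per position, one column per nucleotide."""
    index = {nt: i for i, nt in enumerate(ALPHABET)}
    rows = []
    for nt in seq.upper():
        if nt not in index:
            raise ValueError(f"unexpected character: {nt!r}")
        row = [0] * len(ALPHABET)
        row[index[nt]] = 1
        rows.append(row)
    return rows


encoded = one_hot("TATAAT")
for nt, row in zip("TATAAT", encoded):
    print(nt, row)
```

Each row contains a single 1, so a sequence of length L becomes an L x 4 matrix that can be paired with its measured expression level as a training example.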

Citation

Wang Y, Wang H, Liu L, et al. Synthetic Promoter Design in Escherichia coli based on Generative Adversarial Network[J]. BioRxiv, 2019: 563775.

References

[1] Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., and Courville, A. C. (2017) Improved training of Wasserstein GANs, In Advances in Neural Information Processing Systems, pp 5767-5777.

[2] Thomason, M. K., Bischler, T., Eisenbart, S. K., Forstner, K. U., Zhang, A., Herbig, A., Nieselt, K., Sharma, C. M., and Storz, G. (2015) Global transcriptional start site mapping using differential RNA sequencing reveals novel antisense RNAs in Escherichia coli, Journal of Bacteriology 197, 18-28.

License

This project is licensed under the MIT License; see the LICENSE.md file for details.