Code for computational models in "Synthetic Promoter Design in Escherichia coli based on Deep Generative Network".
We report a novel AI-based framework for de novo promoter design in E. coli that can design brand-new synthetic promoters in silico. It combines a deep generative model, which guides the search, with a prediction model, which pre-selects the most promising promoters.
In the experiments, up to 70.8% of the AI-designed promoters were confirmed to be functional and shared no significant sequence similarity with the E. coli genome. This repository provides the code used to generate promoter sequences, which can then be tested experimentally.
Our computational models can be downloaded directly with:
git clone https://github.com/HaochenW/Deep_promoter.git
Installation has been successfully tested on a Linux platform.
.\seq\sequence_data.txt
python gan_language.py
The CNN model was trained on the dataset from Thomason et al. [2], which contains 14098 promoters with corresponding gene expression levels measured by dRNA-seq. The SVR model was trained on the first-round experimental results.
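Models of this kind need promoter sequences in numeric form. The sketch below shows one common option, one-hot encoding, in plain Python. It is an illustration only: the function name `one_hot_encode` and the A, C, G, T column order are assumptions, and the encoding actually used by `predictor.py` may differ.

```python
# Hypothetical helper for illustration; not taken from this repository.
BASES = "ACGT"  # assumed column order: A, C, G, T


def one_hot_encode(seq):
    """Return one 4-element row per nucleotide, with a 1 in the matching column."""
    encoded = []
    for base in seq.upper():
        if base not in BASES:
            # Ambiguity codes such as N are rejected in this sketch.
            raise ValueError(f"unexpected nucleotide: {base!r}")
        encoded.append([1 if base == b else 0 for b in BASES])
    return encoded


# Encode a short made-up sequence.
example = one_hot_encode("ACGT")
```

Each sequence of length n becomes an n x 4 matrix, which can be passed to a convolutional model as is or flattened into a feature vector for a regression model.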
.\seq\predicted_promoters.fa
python predictor.py
seq_exp_SVR.txt
and seq_exp_CNN.txt
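The `.fa` extension of `predicted_promoters.fa` suggests FASTA format, although the file layout is not described here. Assuming standard FASTA (a header line starting with `>` followed by one or more sequence lines), a minimal reader could look like the sketch below. The name `parse_fasta` and the demo records are hypothetical.

```python
# Hypothetical helper for illustration; assumes standard FASTA layout.
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # a sequence may span several lines
    if header is not None:
        yield header, "".join(chunks)


# Made-up records, used only to demonstrate the parser.
demo = [">promoter_1", "TTGACA", "TATAAT", ">promoter_2", "ACGTACGT"]
records = list(parse_fasta(demo))
```

Because the function accepts any iterable of lines, an open file handle can be passed to it directly in place of the `demo` list.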
Wang Y, Wang H, Liu L, et al. Synthetic Promoter Design in Escherichia coli based on Generative Adversarial Network[J]. BioRxiv, 2019: 563775.
[1] Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V., and Courville, A. C. (2017) Improved training of Wasserstein GANs, In Advances in Neural Information Processing Systems, pp 5767-5777.
[2] Thomason, M. K., Bischler, T., Eisenbart, S. K., Forstner, K. U., Zhang, A., Herbig, A., Nieselt, K., Sharma, C. M., and Storz, G. (2015) Global transcriptional start site mapping using differential RNA sequencing reveals novel antisense RNAs in Escherichia coli, Journal of bacteriology 197, 18-28.
This project is licensed under the MIT License - see the LICENSE.md file for details