JesperGrud / IMAGE

GNU Affero General Public License v3.0

Integrated analysis of motif activity and gene expression changes of transcription factors

Jesper G. S. Madsen1,4, Alexander Rauch1,4, Elvira Laila Van Hauwaert1, Søren Fisker Schmidt1,2, Marc Winnfeld3, Susanne Mandrup1,5

1 Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark.
2 Present address: Institute for Diabetes and Cancer, Helmholtz Center Munich, German Research Center for Environmental Health, Neuherberg, Germany.
3 Research and Development, Beiersdorf AG, Hamburg, Germany.
4 These authors contributed equally.
5 Corresponding author (s.mandrup@bmb.sdu.dk)

Abstract

The ability to predict transcription factors based on sequence information in regulatory elements is a key step in systems-level investigation of transcriptional regulation. Here, we have developed a novel tool, IMAGE, for precise prediction of causal transcription factors based on transcriptome profiling and genome-wide maps of enhancer activity. High precision is obtained by combining a near-complete database of position weight matrices (PWMs), generated by compiling public databases and systematic prediction of PWMs for uncharacterized transcription factors, with a state-of-the-art method for PWM scoring and a novel machine learning strategy, based on both enhancers and promoters, to predict the contribution of motifs to transcriptional activity. We applied IMAGE to published data obtained during 3T3-L1 adipocyte differentiation and showed that IMAGE predicts causal transcriptional regulators of this process with higher confidence than existing methods. Furthermore, we generated genome-wide maps of enhancer activity and transcripts during human mesenchymal stem cell commitment and adipocyte differentiation, and used IMAGE to identify positive and negative transcriptional regulators of this process. Collectively, our results demonstrate that IMAGE is a powerful and precise method for prediction of regulators of gene expression.

Scripts for reproduction of manuscript figures

All manuscript figures that were prepared in R can be reproduced from the R scripts in this repository.
To reproduce the figures locally, please clone this repository and download the data files here.
Each script is in markdown format (.md) and can be opened in RStudio, where the code chunks enclosed in backticks can be executed.
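
For readers who prefer to run a script outside RStudio, the following is a minimal sketch of one way to do it. It is not part of the repository: the helper names `extract_chunks` and `run_markdown_script` and the file name `Figure1.md` are hypothetical. It assumes that every code chunk is delimited by lines starting with three backticks, that all chunks contain R code, and that the data files have already been downloaded to the locations the script expects.

```r
# Hypothetical helper (not part of this repository): return the lines that sit
# inside fenced code chunks of a markdown document.
extract_chunks <- function(lines) {
  fence <- grepl("^\\s*`{3}", lines)   # TRUE on opening and closing fence lines
  inside <- cumsum(fence) %% 2 == 1    # odd fence count means we are in a chunk
  lines[inside & !fence]               # drop the fence lines themselves
}

# Hypothetical helper: read a markdown script and evaluate all of its chunks
# in order, in the given environment.
run_markdown_script <- function(path, envir = globalenv()) {
  code <- extract_chunks(readLines(path, warn = FALSE))
  eval(parse(text = code), envir = envir)
}

# Example call, assuming a script named Figure1.md in the working directory:
# run_markdown_script("Figure1.md")
```

Because all chunks are evaluated in one pass, an error in an early chunk stops the run. Stepping through the chunks in RStudio remains the easier way to inspect intermediate results.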

Main figures

Figure 1
Figure 2
Figure 3
Figure 4
Figure 5

Supplementary figures

Figure S1
Figure S2

Links to relevant sites

IMAGE download
Full-length manuscript
NCBI GEO
Mandrup group website