hamrhein / mouse_embryo

Graphviz graphs using edge data from String-db

Generating Blossom and Transcription Factor STRING diagrams

This repository provides reproducible, full-resolution figures for the manuscript:

The changing mouse embryo transcriptome at whole tissue and single-cell resolution

Peng He, Brian A. Williams, Georgi K. Marinov, Diane Trout, Henry Amrhein, Libera Berghella, Say-Tar Goh, Ingrid Plajzer-Frick, Veena Afzal, Len A. Pennacchio, Diane E. Dickel, Axel Visel, Bing Ren, Ross C. Hardison, Yu Zhang and Barbara J. Wold

Instructions for running locally

The following prerequisites are assumed to be installed on the local host:

*  unzip
*  wget
*  Python 3.6 or higher
*  NumPy
*  Pandas
*  Matplotlib
*  PyGraphviz (https://pygraphviz.github.io)
*  imageio (https://imageio.github.io)
*  Microsoft TrueType fonts
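
Before starting, it can help to confirm that the prerequisites are visible from the current environment. The following is a minimal sketch and not part of the repository: the helper name `missing_prerequisites` is hypothetical, and the import names are assumed to be the usual ones for the packages listed above. It does not check the Python version or the fonts.

```python
import importlib.util
import shutil

# Import names for the Python packages listed above (assumed to be the usual ones).
PYTHON_MODULES = ["numpy", "pandas", "matplotlib", "pygraphviz", "imageio"]
# Command-line tools listed above.
COMMANDS = ["unzip", "wget"]


def missing_prerequisites(modules=PYTHON_MODULES, commands=COMMANDS):
    """Return the names of modules and commands that cannot be found."""
    missing = [m for m in modules if importlib.util.find_spec(m) is None]
    missing += [c for c in commands if shutil.which(c) is None]
    return missing


if __name__ == "__main__":
    missing = missing_prerequisites()
    print("Missing:", ", ".join(missing) if missing else "nothing")
```
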
  1. Clone or download the repository into a working directory
# Clone using git
git clone https://github.com/hamrhein/mouse_embryo.git

# Download directly
wget https://github.com/hamrhein/mouse_embryo/archive/master.zip
unzip master.zip
rm master.zip

To run the utilities, either work from the repository's top-level directory or set your PYTHONPATH environment variable to reference it.
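
As an alternative to exporting PYTHONPATH in the shell, a script can add the repository directory to its own module search path. This is a minimal sketch, assuming the repository sits in a directory named `mouse_embryo` (the name `git clone` produces; the zip download unpacks to a different name). The helper name `add_to_module_path` is hypothetical.

```python
import os
import sys


def add_to_module_path(directory):
    """Put `directory` at the front of sys.path so its modules can be imported."""
    path = os.path.abspath(directory)
    if path not in sys.path:
        sys.path.insert(0, path)
    return path


# Assumed location of the repository; adjust to wherever it actually lives.
add_to_module_path("mouse_embryo")
```
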

  1. Download the Mus musculus annotation from STRING
/usr/bin/wget -q https://stringdb-static.org/download/protein.aliases.v11.0/10090.protein.aliases.v11.0.txt.gz
/usr/bin/wget -q https://stringdb-static.org/download/protein.links.detailed.v11.0/10090.protein.links.detailed.v11.0.txt.gz
/usr/bin/wget -q https://stringdb-static.org/download/protein.actions.v11.0/10090.protein.actions.v11.0.txt.gz
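
Because `wget -q` prints nothing, a quick look at the first few lines of each gzip-compressed file is one way to confirm that a download completed and is readable. This is a minimal sketch using only the standard library. The helper name `preview_gzip` is hypothetical, and nothing is assumed about the column layout of the files.

```python
import gzip
import itertools
import os


def preview_gzip(path, n=3):
    """Return the first n lines of a gzip-compressed text file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in itertools.islice(handle, n)]


if __name__ == "__main__":
    name = "10090.protein.aliases.v11.0.txt.gz"
    if os.path.exists(name):
        for line in preview_gzip(name):
            print(line)
```
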
  1. Build the database
python3 util/build_sql_stringdb_database.py 10090.protein.aliases.v11.0.txt.gz 10090.protein.links.detailed.v11.0.txt.gz 10090.protein.actions.v11.0.txt.gz mus_musculus_stringdb_v11.0.db
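
Once the build finishes, the result can be inspected from Python. This sketch assumes the output is an SQLite file, which the script name suggests but these instructions do not state. The helper name `list_tables` is hypothetical, and no table names are assumed.

```python
import os
import sqlite3


def list_tables(db_path):
    """Return a dict mapping each table name in an SQLite file to its row count."""
    connection = sqlite3.connect(db_path)
    try:
        names = [row[0] for row in connection.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        # Table names come from the database itself, so quoting them is enough here.
        return {name: connection.execute(
            f'SELECT COUNT(*) FROM "{name}"').fetchone()[0] for name in names}
    finally:
        connection.close()


if __name__ == "__main__":
    db = "mus_musculus_stringdb_v11.0.db"
    if os.path.exists(db):
        for table, rows in list_tables(db).items():
            print(table, rows)
```

Counting rows scans each table, so this may take a while on a large database.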
  1. Remove the downloaded flat files
rm 10090.protein.*.v11.0.txt.gz
  1. Make directories for output files

Many files will be produced, so it's best to give each figure its own output folder.

mkdir figure2 figure10
  1. Download the needed data files
wget -O MouseLimbData.h5 https://woldlab.caltech.edu/nextcloud/index.php/s/syNtQbdGessF5NB/download
wget -O peng_bloom.zip https://woldlab.caltech.edu/nextcloud/index.php/s/MQ7DWssYTfmmPnR/download
unzip peng_bloom.zip
rm peng_bloom.zip
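
The build commands in these instructions expect specific input files in the working directory. The following minimal sketch confirms that they are present and non-empty. The file names are taken from the commands in this README, on the assumption that the two `peng_bloom_*` files come from `peng_bloom.zip`. The helper name `missing_files` is hypothetical.

```python
import os

# File names taken from the commands in these instructions.
EXPECTED_FILES = [
    "MouseLimbData.h5",
    "peng_bloom_adjacency_matrix.tsv",
    "peng_bloom_cluster_size_table.txt",
    "mus_musculus_stringdb_v11.0.db",
]


def missing_files(names=EXPECTED_FILES, directory="."):
    """Return the expected files that are absent or empty in `directory`."""
    missing = []
    for name in names:
        path = os.path.join(directory, name)
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing


if __name__ == "__main__":
    print("Missing or empty:", missing_files() or "none")
```
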
  1. Build the Blossom graph for Figure 2
util/build_blossom_graph.py peng_bloom_adjacency_matrix.tsv peng_bloom_cluster_size_table.txt --output_base figure2_blossom_graph --output_dir figure2
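
For readers unfamiliar with how an adjacency matrix maps onto a Graphviz graph, the idea can be sketched in plain Python. This is an illustration only, not the implementation in `util/build_blossom_graph.py`. It assumes a symmetric, tab-separated matrix whose first row and first column hold node labels, treats the graph as undirected, and emits DOT text rather than calling PyGraphviz. The function name `adjacency_to_dot` and its `threshold` parameter are hypothetical, and the actual file layout may differ.

```python
import csv
import io


def adjacency_to_dot(tsv_text, threshold=0.0):
    """Convert a labelled, tab-separated adjacency matrix to undirected DOT text."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    labels = rows[0][1:]  # header row: empty corner cell, then column labels
    lines = ["graph G {"]
    for i, row in enumerate(rows[1:]):
        source = row[0]
        # Only read the upper triangle so each undirected edge appears once.
        for j in range(i + 1, len(labels)):
            weight = float(row[j + 1])
            if weight > threshold:
                lines.append(f'  "{source}" -- "{labels[j]}" [label={weight:g}];')
    lines.append("}")
    return "\n".join(lines)


# A made-up three-node matrix, for illustration only.
example = "\tA\tB\tC\nA\t0\t1\t0\nB\t1\t0\t2\nC\t0\t2\t0\n"
print(adjacency_to_dot(example))
```

The printed text can be saved to a `.dot` file and rendered with any Graphviz layout program.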
  1. Build the TF networks for extended Figure 10
util/build_10x_tf_graphs.py MouseLimbData.h5 mus_musculus_stringdb_v11.0.db --output_dir figure10

Instructions for running with Singularity

  1. Download the Singularity recipe
wget https://github.com/hamrhein/mouse_embryo/raw/master/mouse_embryo_paper.singularity
  1. Build the Singularity container
singularity build mouse_embryo_paper.sif mouse_embryo_paper.singularity
  1. Run the Singularity container
singularity run mouse_embryo_paper.sif

Instructions for running with Docker

  1. Build the Docker container
docker build github.com/hamrhein/mouse_embryo -t mouse_embryo_paper
  1. Run the Docker container

For this to work, you'll need a bind mount so that the Docker container can write its output to the host machine's file system.

docker run -v [host directory]:/output -it mouse_embryo_paper

Replace '[host directory]' with the absolute path of the directory where you want the files to go.
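
Docker treats a `-v` source that is not an absolute path as a named volume rather than a host directory, so it is worth resolving the path before running the container. The following minimal sketch builds the command shown above. The helper name `docker_run_command` and the directory name `results` are hypothetical, and the command is only printed, not executed.

```python
import os
import shlex


def docker_run_command(host_directory, image="mouse_embryo_paper"):
    """Build the `docker run` argument list with an absolute bind-mount source."""
    source = os.path.abspath(os.path.expanduser(host_directory))
    return ["docker", "run", "-v", f"{source}:/output", "-it", image]


# Print a copy-and-paste command for a directory named "results" (an example name).
print(" ".join(shlex.quote(arg) for arg in docker_run_command("results")))
```
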