pipeline for genome wide MASSPRF analysis
Note: These instructions are by no means exhaustive and will change based on your individual shell configurations and environment. In general, this instruction (and the pipeline itself) assumes a unix environment running bash. Advanced users w/ different configurations should be able to install without these instructions.
It is important to note that you may need to build both massprf and massprf_preprocess independently. The final executable for massprf will be in ./bin, while the massprf_preprocess executable will be in ./MASSPRF_preprocessing_08July2016.
Get the absolute path of your compiled massprf & MASSPRF_preprocess. These will be something like:
~/PATH/TO/MASSPRF/MASSPRF_10July2016/bin/massprf ~/PATH/TO/MASSPRF/MASSPRF_10July2016/MASSPRF_preprocessing_08July2016/MASSPRF_preprocess
Create a custom symlinks folder in your home directory if you haven't already.
mkdir ~/symbolics
change to that directory:
cd ~/symbolics
Create links to massprf & massprf_preprocess in that directory
ln -s ~/PATH/TO/MASSPRF/MASSPRF_10July2016/bin/massprf massprf
ln -s ~/PATH/TO/MASSPRF/MASSPRF_10July2016/MASSPRF_preprocessing_08July2016/MASSPRF_preprocess massprf_preprocess
Add the symbolic link folder to your $PATH in ~/.bash_profile
vim ~/.bash_profile
Scroll to the line that says something like
PATH=$PATH:OTHERPATHS
Append a colon and add the path to your symbolic links, such that it looks like this:
PATH=$PATH:OTHERPATHS:$HOME/symbolics
pip install conda
If you are working on a cluster, you may need to install miniconda directly from the package due to file permissioning/pip being unavailable. In this case, download the file from https://conda.io/miniconda.html , run it on the cluster, and make sure that the installation directory is added to your path via ~/.bashrc. After installing miniconda, use source ~/.bashrc
to source the path, then type python
and check that your python version is >3.5 and is produced by 'Continuum Analytics'
conda update conda
conda update python
Add Bioconda channel to Conda:
conda config --add channels bioconda
Pandas:
conda install pandas
Biopython:
conda install biopython
Gffutils:
conda install gffutils
pyvcf:
conda install pyvcf
git clone https://github.com/Townsend-Lab-Yale/massprf-pipeline.git
To run example, see jobs.list