MikkelSchubert / paleomix

Pipelines and tools for the processing of ancient and modern HTS data.
https://paleomix.readthedocs.io/en/stable/
MIT License

[info] conda instructions for paleomix #31

Closed jfy133 closed 4 years ago

jfy133 commented 4 years ago

I don't know if this is of interest, but since conda is very popular nowadays, I thought I'd share my notes on how to create a conda environment that can run PALEOMIX (I have only tested bam_pipeline so far).

If you're interested, I can make a PR into the docs; otherwise you could just leave this open for people to find.


This assumes you've already installed conda and configured it to use the bioconda channel.
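
If the channels are not set up yet, something along these lines should do it. This is only a sketch: the `setup_channels` helper is a made-up name, and the channel order is an assumption based on the usual bioconda setup, so check the bioconda docs for the currently recommended order.

```shell
# Hypothetical helper: register the channels using whichever conda command
# is passed in (defaults to `conda`). The channel order is an assumption.
setup_channels() {
    conda_cmd="${1:-conda}"
    # `--add` prepends, so the channel added last gets the highest priority.
    for channel in defaults bioconda conda-forge; do
        "$conda_cmd" config --add channels "$channel"
    done
}

# Only touch the configuration if conda is actually installed.
if command -v conda >/dev/null 2>&1; then
    setup_channels conda
    conda config --show channels
else
    echo "conda not found on PATH; install it first" >&2
fi
```

Passing `echo` instead of `conda` (i.e. `setup_channels echo`) prints the commands without running them, which is handy for a dry run.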

Create the conda environment. Note that this adds the GATK and R requirements, which are not listed explicitly in the current PALEOMIX documentation.

conda create -n paleomix python=2.7 pip adapterremoval=2.3.1 samtools=1.9 picard=2.22.9 bowtie2=2.3.5.1 bwa=0.7.17 mapdamage2=2.0.9 gatk=3.8 r-base=3.5.1 r-rcpp=1.0.4.6 r-rcppgsl=0.3.7 r-gam=1.16.1 r-inline=0.3.15

conda activate paleomix

Then, while in the paleomix environment, install paleomix:

pip install --user paleomix
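
At this point it may be worth checking that everything is actually visible on the PATH. A rough sketch follows; `missing_tools` is a made-up helper, and the executable names are my assumption of what the conda packages from the create command above install, so adjust as needed.

```shell
# Hypothetical helper: print the names of the given executables that
# cannot be found on PATH, separated by spaces.
missing_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the leading space left over from the loop.
    echo "${missing# }"
}

# Executable names are assumptions; they may differ between package versions.
not_found="$(missing_tools paleomix AdapterRemoval samtools picard bowtie2 bwa mapDamage Rscript java)"
if [ -n "$not_found" ]; then
    echo "Not found on PATH: $not_found" >&2
else
    echo "All expected executables found"
fi
```

If paleomix itself is reported as missing, note that `pip install --user` normally puts scripts under `~/.local/bin` on Linux, which may not be on your PATH.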

Now work around the 'difficult' GATK recipe by downloading the last GATK v3 JAR, then linking it and the conda version of picard into the location that paleomix requires.

wget https://storage.googleapis.com/gatk-software/package-archive/gatk/GenomeAnalysisTK-3.8-1-0-gf15c1c3ef.tar.bz2
## not completely necessary, but might as well
gatk3-register GenomeAnalysisTK-3.8-1-0-gf15c1c3ef.tar.bz2

mkdir -p /home/<user>/install/jar_root/
ln -s /<path>/<to>/miniconda2/envs/paleomix/opt/gatk-3.8/GenomeAnalysisTK.jar /home/<user>/install/jar_root/
ln -s /<path>/<to>/miniconda2/envs/paleomix/share/picard-2.22.9-0/picard.jar /home/<user>/install/jar_root/
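
Since the paths above have to be filled in by hand, it is easy to end up with a dangling symlink. A minimal sketch for checking the result, assuming the JAR root is `~/install/jar_root` as above (`check_jar_root` is a made-up helper, not part of paleomix):

```shell
# Hypothetical helper: report whether the two JARs linked above are present
# in the given directory. Returns non-zero if any of them is missing.
check_jar_root() {
    jar_root="$1"
    status=0
    for jar in GenomeAnalysisTK.jar picard.jar; do
        # `-e` follows symlinks, so a dangling link counts as missing.
        if [ -e "$jar_root/$jar" ]; then
            echo "ok       $jar"
        else
            echo "MISSING  $jar"
            status=1
        fi
    done
    return "$status"
}

check_jar_root "$HOME/install/jar_root" || echo "Fix the links above before running the pipeline" >&2
```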

Finally, test that it worked properly:

cd ~
paleomix bam_pipeline example .
cd ~/bam_pipeline
paleomix bam_pipeline run 000_makefile.yaml

Once completed, you can leave the paleomix environment with

conda deactivate

I also made an environment file to make this slightly easier (you need to remove the .txt suffix before running the command). This only shortens the create command, though; the rest of the setup is still required.

To create the environment, run:

conda env create -f paleomix_environment.yaml

paleomix_environment.yaml.txt
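
In case the attachment is not available, the following sketch writes an environment file with the same pinned versions as the create command above. The attached file may differ in detail, and the `channels` section is an assumption on my part.

```shell
# Write an environment file mirroring the `conda create` command above.
# The channel list is an assumption; the package pins are copied from that command.
cat > paleomix_environment.yaml <<'EOF'
name: paleomix
channels:
  - conda-forge
  - bioconda
  - defaults
dependencies:
  - python=2.7
  - pip
  - adapterremoval=2.3.1
  - samtools=1.9
  - picard=2.22.9
  - bowtie2=2.3.5.1
  - bwa=0.7.17
  - mapdamage2=2.0.9
  - gatk=3.8
  - r-base=3.5.1
  - r-rcpp=1.0.4.6
  - r-rcppgsl=0.3.7
  - r-gam=1.16.1
  - r-inline=0.3.15
EOF
```

Because the file sets `name: paleomix`, the `conda env create -f` command above creates an environment with the same name as the manual route.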

MikkelSchubert commented 4 years ago

Thank you very much!

I'll take a look at integrating this into the docs when I have time, but I would of course also welcome a PR.