MelbourneGenomics / cpipe

The open source version of the Melbourne Genomics Health Alliance Exome Sequencing Pipeline

Downloading .genome files manually (when machine doesn't have mysql) #8

Open hdashnow opened 9 years ago

hdashnow commented 9 years ago

The install.sh script checks for the genome file /vlsci/VR0193/shared/cpipe/hg19/hg19.genome and tries to download it if it doesn't exist. This automatic download is only possible on machines with mysql (so it failed on Merri, for example).

What is hg19.genome? And where do I download it manually?

I see from config.groovy that "Genome needed (currently) only if using expanded splice regions"

Yet the install script says: WARNING: Cpipe will not operate correctly unless the /vlsci/VR0193/shared/cpipe/hg19/hg19.genome file is created

It might be worth adding some clarifying statements to the install script and the documentation (the file is not currently mentioned in the documentation PDF or the hg19 README).

hdashnow commented 9 years ago

So, apparently the .genome file is necessary. When I tried to run the pipeline without it, this command failed:

[hdashnow@merri analysis]$ python /vlsci/VR0193/shared/cpipe/pipeline/scripts/create_exon_bed.py   -c -s ../design/CARDIOM.bed /vlsci/VR0193/shared/cpipe/tools/annovar/humandb/hg19_refGene.txt ../design/CARDIOM.transcripts.txt - | /vlsci/VR0193/shared/cpipe/tools/bedtools/2.18.2/bin/bedtools slop -g /vlsci/VR0193/shared/cpipe/hg19/hg19.genome -b 2 -i - > ../design/CARDIOM.splice.bed;  python /vlsci/VR0193/shared/cpipe/pipeline/scripts/create_exon_bed.py   -c ../design/CARDIOM.bed /vlsci/VR0193/shared/cpipe/tools/annovar/humandb/hg19_refGene.txt ../design/CARDIOM.transcripts.txt ../design/CARDIOM.exons.bed
Error: The requested genome file (/vlsci/VR0193/shared/cpipe/hg19/hg19.genome) could not be opened. Exiting!

...
ssadedin commented 9 years ago

You're right. It used not to be required, because we did not use a window for splice variants by default. Now we do, so it's needed. I will update the installer.
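
For context on why the failing command needs the file: `bedtools slop` widens each interval by a fixed number of bases and uses the genome file, a tab-separated list of chromosome names and sizes, to keep the widened interval inside the chromosome. The following is a minimal Python sketch of that clamping, not bedtools' actual implementation. The helper names `read_genome` and `slop` and the toy chromosome size are hypothetical and chosen only for illustration.

```python
def read_genome(lines):
    """Parse genome-file lines of the form '<chrom>\t<size>' into a dict."""
    sizes = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        chrom, size = line.split("\t")[:2]
        sizes[chrom] = int(size)
    return sizes


def slop(interval, sizes, b):
    """Extend a BED interval (chrom, start, end) by b bases on each side,
    clamped to [0, chromosome size]."""
    chrom, start, end = interval
    return (chrom, max(0, start - b), min(sizes[chrom], end + b))


if __name__ == "__main__":
    # Toy genome with a single 100-base chromosome (illustrative only).
    sizes = read_genome(["chr1\t100"])
    print(slop(("chr1", 10, 20), sizes, 2))
    print(slop(("chr1", 1, 99), sizes, 2))  # clamped at both ends
```

Without the chromosome sizes there is no upper bound to clamp against, which is consistent with bedtools exiting when the file passed to `-g` cannot be opened.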

hdashnow commented 9 years ago

Where can I download it?

ssadedin commented 9 years ago

The best method is probably to download it from UCSC using the mysql method, if possible. I will look into whether we can do it without mysql somehow. In the interim, I will send you a file to use.
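
One possible way to build the file without mysql, sketched here as an assumption rather than as what install.sh does: if an indexed hg19 reference FASTA is available (a `.fai` file as written by `samtools faidx`), its first two columns already hold the sequence name and length, which is the layout bedtools expects in a genome file. The function names and the paths passed to `write_genome` are hypothetical, and the thread does not confirm that a file built this way matches the one the installer would download from UCSC.

```python
def fai_to_genome(fai_lines):
    """Yield '<chrom>\t<length>' strings from the lines of a .fai index.

    A .fai line has five tab-separated fields:
    NAME, LENGTH, OFFSET, LINEBASES, LINEWIDTH. Only the first two are used.
    """
    for line in fai_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        yield f"{fields[0]}\t{int(fields[1])}"


def write_genome(fai_path, genome_path):
    """Write a bedtools-style genome file next to an existing .fai index."""
    with open(fai_path) as fai, open(genome_path, "w") as out:
        for row in fai_to_genome(fai):
            out.write(row + "\n")


if __name__ == "__main__":
    # Illustrative .fai line; only the name and length columns matter here.
    example = ["chr1\t249250621\t52\t60\t61\n"]
    for row in fai_to_genome(example):
        print(row)
```

The chromosome names in the resulting file need to match those used in the BED files passed to `bedtools slop` (for example `chr1` rather than `1`), so it is worth checking the naming before using it in place of the downloaded file.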
