zeeev / wham

Structural variant detection and association testing
Other
103 stars 25 forks source link

whamg trouble - finishing instantly with no results #38

Open afkoeppel opened 8 years ago

afkoeppel commented 8 years ago

Trying to call SVs on some rat sequences and want to try whamg.

whamg -a $GENOME –f SL150028.bwamem.unfilt.rmdup.sort.bam > SL150028.vcf 2> SL150028.err

Strangely, it seems to be finishing instantaneously (despite >50Gb bam file), and producing no results.

cat SL150028.err INFO: fasta file: /home/sdt5z/genomes/igenomes/Rattus_norvegicus/Ensembl/Rnor_5.0/Sequence/WholeGenomeFasta/genome.fa INFO: Loading discordant reads into forest. INFO: Finished loading reads. INFO: Gathering graphs from forest. INFO: Matching breakpoints. INFO: Printing. INFO: done processing trees INFO: WHAM finished normally, goodbye!

The output VCF file contains the VCF header lines but is otherwise empty.

I also tried with the -c option limiting to just a few chromosomes, but the results were the same. Any advice about what I might be doing wrong would be appreciated.

zeeev commented 8 years ago

Odd. Can you post the full error message? This is whole genome paired end sequencing data? Also, can you post the header of the bam file?
Specifically, do you have the SQ tag?

afkoeppel commented 8 years ago

Well that's what's so odd, there was no error message. Nothing at all was echoed to the screen, and according to the .err file everything went fine, but there was no output (except for the VCF header) and it took no time to run.

samtools view -H SL150028.bwamem.unfilt.rmdup.sort.bam @HD VN:1.3 SO:coordinate @SQ SN:10 LN:112200500 @SQ SN:11 LN:93518069 @SQ SN:12 LN:54450796 @SQ SN:13 LN:118718031 @SQ SN:14 LN:115151701 @SQ SN:15 LN:114627140 @SQ SN:16 LN:90051983 @SQ SN:17 LN:92503511 @SQ SN:18 LN:87229863 @SQ SN:19 LN:72914587 @SQ SN:1 LN:290094216 @SQ SN:20 LN:57791882 @SQ SN:2 LN:285068071 @SQ SN:3 LN:183740530 @SQ SN:4 LN:248343840 @SQ SN:5 LN:177180328 @SQ SN:6 LN:156897508 @SQ SN:7 LN:143501887 @SQ SN:8 LN:132457389 @SQ SN:9 LN:121549591 @SQ SN:MT LN:16313 @SQ SN:X LN:154597545 @RG ID:SL150028 SM:SL150028 @PG ID:bwa PN:bwa CL:bwa mem /home/sdt5z/genomes/igenomes/Rattus_norvegicus/Ensembl/Rnor_5.0/Sequence/BWAIndex/genome.fa SL150028_1.fastq.gz SL150028_2.fastq.gz -t 20 -M -R @RG\tID:SL150028\tSM:SL150028 VN:0.7.12-r1039

zeeev commented 8 years ago

can you post the full SL150028.err error file?

There should be something like:

INFO: for Sample:xxx STATS: UV940: mean depth ..............: 39.2523 STATS: UV940: sd depth ................: 10.7403 STATS: UV940: mean insert length: .....: 319.644 STATS: UV940: median insert length ....: 318 STATS: UV940: sd insert length ........: 69.2122 STATS: UV940: lower insert cutoff .....: 208.905 STATS: UV940: upper insert cutoff .....: 492.675 STATS: UV940: average base quality ....: 28.347 STATS: UV940: average mapping quality .: 59.0846 STATS: UV940: number of reads used ....: 100227

afkoeppel commented 8 years ago

What I posted in the original message was the full file. I have only INFO fields, no STATS fields like in your example.

zeeev commented 8 years ago

@afkoeppel thanks, I was expecting it to be longer, but maybe not if it is an options parsing error.

original:

"whamg -a $GENOME f SL150028.bwamem.unfilt.rmdup.sort.bam > SL150028.vcf 2> SL150028.err"

try:

"whamg -a $GENOME -f SL150028.bwamem.unfilt.rmdup.sort.bam > SL150028.vcf 2> SL150028.err"

Note the difference in the dash for -f.

If this solves the problem I'll add better error handling for this case.

afkoeppel commented 8 years ago

Well I'll be. I replaced the dash and now it seems to be running. I'll update as soon as I have output. I'm not even quite sure how I got that non-dash dash, but I think maybe from copy-pasting from the instructions here (I sometimes copy example commands and then just replace the file names with my own).

zeeev commented 8 years ago

Shame on me! I'll make sure to check for m vs n dash.

http://www.punctuationmatters.com/hyphen-dash-n-dash-and-m-dash/

The wrong dash was in the README.

afkoeppel commented 8 years ago

That did the trick. I have a vcf with SVs in it now. Thank you so much for your help.