Open afkoeppel opened 8 years ago
Odd.
Can you post the full error message?
This is whole genome paired end sequencing data?
Also, can you post the header of the bam file?
Specifically, do you have the SQ tag?
Well that's what's so odd, there was no error message. Nothing at all was echoed to the screen, and according to the .err file everything went fine, but there was no output (except for the VCF header) and it took no time to run.
samtools view -H SL150028.bwamem.unfilt.rmdup.sort.bam @HD VN:1.3 SO:coordinate @SQ SN:10 LN:112200500 @SQ SN:11 LN:93518069 @SQ SN:12 LN:54450796 @SQ SN:13 LN:118718031 @SQ SN:14 LN:115151701 @SQ SN:15 LN:114627140 @SQ SN:16 LN:90051983 @SQ SN:17 LN:92503511 @SQ SN:18 LN:87229863 @SQ SN:19 LN:72914587 @SQ SN:1 LN:290094216 @SQ SN:20 LN:57791882 @SQ SN:2 LN:285068071 @SQ SN:3 LN:183740530 @SQ SN:4 LN:248343840 @SQ SN:5 LN:177180328 @SQ SN:6 LN:156897508 @SQ SN:7 LN:143501887 @SQ SN:8 LN:132457389 @SQ SN:9 LN:121549591 @SQ SN:MT LN:16313 @SQ SN:X LN:154597545 @RG ID:SL150028 SM:SL150028 @PG ID:bwa PN:bwa CL:bwa mem /home/sdt5z/genomes/igenomes/Rattus_norvegicus/Ensembl/Rnor_5.0/Sequence/BWAIndex/genome.fa SL150028_1.fastq.gz SL150028_2.fastq.gz -t 20 -M -R @RG\tID:SL150028\tSM:SL150028 VN:0.7.12-r1039
can you post the full SL150028.err error file?
There should be something like:
INFO: for Sample:xxx STATS: UV940: mean depth ..............: 39.2523 STATS: UV940: sd depth ................: 10.7403 STATS: UV940: mean insert length: .....: 319.644 STATS: UV940: median insert length ....: 318 STATS: UV940: sd insert length ........: 69.2122 STATS: UV940: lower insert cutoff .....: 208.905 STATS: UV940: upper insert cutoff .....: 492.675 STATS: UV940: average base quality ....: 28.347 STATS: UV940: average mapping quality .: 59.0846 STATS: UV940: number of reads used ....: 100227
What I posted in the original message was the full file. I have only INFO fields, no STATS fields like in your example.
@afkoeppel thanks, I was expecting it to be longer, but maybe not if it is an options parsing error.
original:
"whamg -a $GENOME –f SL150028.bwamem.unfilt.rmdup.sort.bam > SL150028.vcf 2> SL150028.err"
try:
"whamg -a $GENOME -f SL150028.bwamem.unfilt.rmdup.sort.bam > SL150028.vcf 2> SL150028.err"
Note the difference in the dash for -f.
If this solves the problem I'll add better error handling for this case.
Well I'll be. I replaced the dash and now it seems to be running. I'll update as soon as I have output. I'm not even quite sure how I got that non-dash dash, but I think maybe from copy-pasting from the instructions here (I sometimes copy example commands and then just replace the file names with my own).
Shame on me! I'll make sure to check for m vs n dash.
http://www.punctuationmatters.com/hyphen-dash-n-dash-and-m-dash/
The wrong dash was in the README.
That did the trick. I have a vcf with SVs in it now. Thank you so much for your help.
Trying to call SVs on some rat sequences and want to try whamg.
whamg -a $GENOME –f SL150028.bwamem.unfilt.rmdup.sort.bam > SL150028.vcf 2> SL150028.err
Strangely, it seems to be finishing instantaneously (despite >50Gb bam file), and producing no results.
cat SL150028.err INFO: fasta file: /home/sdt5z/genomes/igenomes/Rattus_norvegicus/Ensembl/Rnor_5.0/Sequence/WholeGenomeFasta/genome.fa INFO: Loading discordant reads into forest. INFO: Finished loading reads. INFO: Gathering graphs from forest. INFO: Matching breakpoints. INFO: Printing. INFO: done processing trees INFO: WHAM finished normally, goodbye!
The output VCF file contains the VCF header lines but is otherwise empty.
I also tried with the -c option limiting to just a few chromosomes, but the results were the same. Any advice about what I might be doing wrong would be appreciated.