molgenis / CoNVaDING

Copy Number Variation Detection In Next-generation sequencing Gene panels was designed for small (single-exon) copy number variation (CNV) detection in high coverage next-generation sequencing (NGS) data
GNU Lesser General Public License v3.0

Awk error while running CoNVaDING #28

Open pradyumnasagar opened 7 years ago

pradyumnasagar commented 7 years ago

When I run the following command, awk reports a "division by zero attempted" error.

Command

perl ./CoNVaDING.pl -mode StartWithBam -inputDir /san2/exome_cnv/samplebam/ -useSampleAsControl -controlsDir /san2/exome_cnv/controolbam/ -outputDir /san2/exome_cnv/output/ -bed /san2/exome_cnv/designed.bed

Error

#######################################
COMMANDLINE OPTIONS IN AFFECT: -mode StartWithBam -inputDir /san2/exome_cnv/samplebam/ -useSampleAsControl -controlsDir /san2/exome_cnv/controolbam/ -outputDir /san2/exome_cnv/output/ -bed /san2/exome_cnv/designed.bed
#######################################

Starting analysis Sat May 6 11:18:54 2017
Reading BAM files to process..
Starting counts analysis..
awk: fatal: division by zero attempted
awk: fatal: division by zero attempted
awk: fatal: division by zero attempted
awk: fatal: division by zero attempted
awk: fatal: division by zero attempted
awk: fatal: division by zero attempted
awk: fatal: division by zero attempted
awk: fatal: division by zero attempted

ljohansson commented 7 years ago

This error can appear when some regions have absolutely no coverage. However, it should not cause any further problems if you are using version 1.2 or higher. In version 1.1.6 we have seen downstream problems in some cases, but those are unrelated to this specific error message. You can continue the regular workflow.

arcm-radboud commented 6 years ago

This error has not been resolved, even though I am using v1.2.1

Line 2288 in CoNVaDING.pl:

my $extractcov = "samtools depth -r $chr:$start-$stop -a -q 0 -Q 0 $bam | awk \'\{sum+=\$3\} END \{print sum\/NR\}\'";

If no reads are mapped within a window, the awk command will try to divide by NR, which is 0.
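To make the failure mode concrete, here is a minimal sketch showing that awk's NR is 0 in the END block when the pipeline feeds it no lines at all. The function name `count_records` and the hand-written rows (chromosome, position, depth, tab-separated) are illustrative assumptions standing in for `samtools depth` output; they are not taken from CoNVaDING.

```shell
#!/bin/sh
# count_records (hypothetical helper): print how many input lines awk saw,
# i.e. the value of NR once END is reached.
count_records() {
    awk 'END { print NR }'
}

# Hand-written stand-in for a region that yields two depth rows.
printf 'chr1\t100\t12\nchr1\t101\t15\n' | count_records   # prints 2

# Stand-in for a region for which nothing is emitted at all.
printf '' | count_records                                 # prints 0
```

With NR equal to 0, the expression `sum/NR` on line 2288 has a zero divisor, which is what gawk reports as "division by zero attempted".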

ljohansson commented 6 years ago

The awk warnings do not cause any problems for further analysis unless all of the targets in the BED file have a coverage of zero. In that case CoNVaDING will crash with a message like:

Uncaught exception from user code:
        Illegal division by zero at /PATH/TO/CoNVaDING-1.2.1/CoNVaDING.pl line 2405.
at /PATH/TO/CoNVaDING-1.2.1/CoNVaDING.pl line 2405
        main::writeCountFile('1\x{9}115252142\x{9}115252336\x{9}NRAS', '1\x{9}115252142\x{9}115252336\x{9}NRAS', 'NRAS', 0, 0, 'HASH(0xe1fab8)', 'HASH(0xe1fae8)', 'HASH(0xe1fa88)') called at /PATH/TO/CoNVaDING-1.2.1/CoNVaDING.pl line 2321
        main::countFromBam('/PATH/TO/BAMS') called at /PATH/TO/CoNVaDING-1.2.1/CoNVaDING.pl line 723
        main::startWithBam('ARRAY(0xe1f758)') called at /PATH/TO/CoNVaDING.pl line 291

The locations in the message match the first line of the BED file. If this happens, remove the BAM file from the project folder and rerun.

wiraki commented 6 years ago

Hi @ljohansson

This is not my experience. The line I mentioned in my previous comment, containing the awk part, is executed inside the "foreach $line (@bedfile)" loop. If that target has zero coverage, awk's division by NR raises a division-by-zero error, and when awk raises that error the whole process is stopped.

In any case, I suggest this modification to line 2288, which makes more sense and is cleaner:

my $extractcov = "samtools depth -r $chr:$start-$stop -a -q 0 -Q 0 $bam | awk \'\{sum+=\$3\} END \{if (NR > 0) print sum\/NR; else print 0;\}\'";
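The guarded awk expression can be tried on its own, outside Perl and without samtools. This is a sketch under assumptions: `mean_depth` is a hypothetical wrapper name, and the input rows are hand-written stand-ins for `samtools depth` output, not real data.

```shell
#!/bin/sh
# mean_depth (hypothetical helper): average of column 3 over all input lines,
# printing 0 instead of dividing when no lines were read (NR == 0).
mean_depth() {
    awk '{ sum += $3 } END { if (NR > 0) print sum / NR; else print 0 }'
}

# Three positions with depths 10, 20 and 30.
printf 'chr1\t100\t10\nchr1\t101\t20\nchr1\t102\t30\n' | mean_depth   # prints 20

# No rows at all: the guard takes the else branch.
printf '' | mean_depth                                                # prints 0
```

Reporting 0 for an empty window treats "no rows" the same as "zero coverage". Whether that suits CoNVaDING's downstream normalisation is not something this sketch checks.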

creggian commented 5 years ago

I got the same error when the BAM file had zero coverage in at least one (but not all) of the regions described in the BED file.

I solved it by editing the samtools command to pass an additional "-a" option:

my $extractcov = "samtools depth -r $chr:$start-$stop -a -a -q 0 -Q 0 $bam | awk \'\{sum+=\$3\} END \{if (NR > 0) print sum\/NR; else print 0;\}\'";

(Note the added "-a" after the first "-a".)

In this way samtools produces the expected output with 0 coverage in every position.
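As a rough illustration of why emitting zero-depth rows sidesteps the error: once every position in the window appears in the output, NR is positive and even the original unguarded awk expression returns 0. The rows below are hand-written stand-ins for what `samtools depth` would print, and `unguarded_mean` is a hypothetical name. The thread does not explain why a single "-a" was not enough in this setup, and the sketch does not try to.

```shell
#!/bin/sh
# unguarded_mean (hypothetical helper): the original awk expression from
# line 2288, with no NR check.
unguarded_mean() {
    awk '{ sum += $3 } END { print sum / NR }'
}

# Three positions, all reported with depth 0: NR is 3, so 0/3 is well defined.
printf 'chr1\t100\t0\nchr1\t101\t0\nchr1\t102\t0\n' | unguarded_mean   # prints 0
```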

Versions

freerkvandijk commented 5 years ago

Thanks for adding the fix, Claudio. I discussed it with Lennart; we will approve the request sometime in the coming weeks and include it in the next release of the software. We first need to test whether it affects any downstream analyses or results.

Regards,

Freerk

creggian commented 5 years ago

Hi Freerk, thanks for taking care of that.

Best, Claudio

vlakhujani commented 4 years ago

Hi @creggian and @freerkvandijk

Has this fix been implemented already?