Open alexcritschristoph opened 1 year ago
Thanks Alex, much appreciated for the bug and the kind words.
I think this is really an issue with count, not with non-mean methods, agree?
What would be a good definition that accounts for reads that cross the boundary? Starting position for start of contig and end for end of contig?
Hi Ben,
So, I see this issue with -m covered_bases and -m covered_fraction in my data as well, do you see that as well?
I'm actually not sure I follow your second question. My guess is that it would be best for the parameter to be a hard cutoff, so that any read that crosses the boundary at all (e.g. the 100 bp from the edge by default) is not counted.
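To make the hard-cutoff idea concrete, here is a minimal sketch of how a read count could respect the end exclusion. It is not CoverM's code: the function name, the `(start, end)` tuple representation of alignments, and the 0-based half-open coordinates are all assumptions made for illustration.

```python
def count_reads_hard_cutoff(reads, contig_length, end_exclusion):
    """Count reads that lie entirely inside the non-excluded region.

    reads: iterable of (start, end) alignment intervals on one contig,
           0-based and half-open (an assumption for this sketch).
    A read touching either excluded end region is not counted.
    """
    lo = end_exclusion                  # first position that is allowed
    hi = contig_length - end_exclusion  # one past the last allowed position
    if lo >= hi:
        # The exclusion swallows the whole contig, so nothing can count.
        return 0
    return sum(1 for start, end in reads if start >= lo and end <= hi)


# Hypothetical reads on a 10,000 bp contig.
reads = [(0, 150), (900, 1050), (1000, 1150), (5000, 5150), (8900, 9050)]
print(count_reads_hard_cutoff(reads, 10000, 0))     # 5
print(count_reads_hard_cutoff(reads, 10000, 1000))  # 2
```

Under this definition the count changes with the exclusion value, which is the behaviour I would expect from -m count.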
Hi Ben - big fan / user of coverM here. Recently I uncovered this issue with v0.6.1:
When I run
coverm contig --contig-end-exclusion 1000 --bam-files ./test/*.bam --output-format sparse -o test1.tsv --no-zeros -m mean
vs
coverm contig --contig-end-exclusion 0 --bam-files ./test/*.bam --output-format sparse -o test1.tsv --no-zeros -m mean
Different results are obtained, consistent with the --contig-end-exclusion parameter working.
But when I run:
coverm contig --contig-end-exclusion 0 --bam-files ./test/*.bam --output-format sparse -o test1.tsv --no-zeros -m count
vs
coverm contig --contig-end-exclusion 1000 --bam-files ./test/*.bam --output-format sparse -o test1.tsv --no-zeros -m count
I get the exact same results, indicating to me that the contig end exclusion parameter is not working. The same is true when -m is set to covered_bases or covered_fraction. I think this is a bug.