Hi Ben, I wanted to get your thoughts on whether the other three metrics that use contig length should do what the mean calculation does: subtract twice the --contig-end-exclusion value from the contig length. The rationale is that since you aren't mapping to the contig ends, they shouldn't be in the calculations. As it stands, some methods like covered_fraction can never reach 100%, because you will always come up 150nt short compared to the full length.
This would affect the covered_fraction, reads_per_base and rpkm methods (with tpm also being affected via rpkm). Subtracting the --contig-end-exclusion lengths from those calculations would bump up the resulting values, with a bigger increase for smaller contigs. Based on some tests with covered_fraction, this does subtly change the rankings of contigs.
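To make the size of the effect concrete, here is a rough Python sketch of the proposed change applied to covered_fraction. This is not the actual implementation: the function names are made up for illustration, and the exclusion of 75 is an assumption chosen so that 2x matches the 150nt mentioned above.

```python
# Illustrative sketch only; function names are hypothetical and this is
# not the tool's real code.

def effective_length(contig_length: int, end_exclusion: int) -> int:
    """Contig length after trimming `end_exclusion` bases from both ends."""
    return max(contig_length - 2 * end_exclusion, 0)


def covered_fraction(covered_bases: int, contig_length: int,
                     end_exclusion: int = 0) -> float:
    """Covered bases divided by the (optionally trimmed) contig length."""
    denominator = effective_length(contig_length, end_exclusion)
    return covered_bases / denominator if denominator > 0 else 0.0


if __name__ == "__main__":
    exclusion = 75  # assumed value, so that 2x = 150nt
    for length in (500, 5_000, 50_000):
        # Best case: every base outside the excluded ends is covered.
        covered = length - 2 * exclusion
        current = covered_fraction(covered, length)
        proposed = covered_fraction(covered, length, exclusion)
        print(f"{length:>6} nt  current={current:.3f}  proposed={proposed:.3f}")
```

In this best case a 500nt contig tops out at 0.70 with the full length as the denominator, while a 50,000nt contig tops out at 0.997, which is why the correction matters most for small contigs.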
Let me know what you think.