Closed itsmisterbrown closed 1 year ago
Hi,
That all sounds about right, with a few clarifications.
Yes it is per-sample (though the default value is different for contigs vs genomes)
But for your example, min-covered-fraction refers to the percentage of bases covered by at least one read, not to whether the mean coverage is >0.1. There isn't enough info in your first table to work out what will be filtered when --min-covered-fraction=10 is applied. But in spirit, I think you get it.
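To make the distinction concrete, here is a minimal Python sketch, not CoverM's actual implementation. The function names, the toy per-base depth lists, and the strict "less than" comparison against the threshold are all assumptions made for illustration. It shows two references with the same mean coverage, only one of which would pass a 10% covered-fraction filter.

```python
# Illustrative sketch only: these helpers are hypothetical and are not
# part of CoverM. Depths are per-base read depths along one reference.

def covered_percent(depths):
    """Percentage of positions covered by at least one read."""
    covered = sum(1 for d in depths if d > 0)
    return 100.0 * covered / len(depths)


def mean_coverage(depths):
    """Plain per-base mean depth over the whole reference."""
    return sum(depths) / len(depths)


def filtered_mean(depths, min_covered_percent):
    """Report mean coverage, or 0 if too little of the reference is covered.

    Assumption: a reference is zeroed when its covered percentage is
    strictly below the threshold.
    """
    if covered_percent(depths) < min_covered_percent:
        return 0.0
    return mean_coverage(depths)


# Toy 100 bp references with identical mean coverage (0.5).
spiky = [10] * 5 + [0] * 95   # 5% of bases covered, at depth 10
even = [1] * 50 + [0] * 50    # 50% of bases covered, at depth 1

for name, depths in (("spiky", spiky), ("even", even)):
    print(name, mean_coverage(depths), covered_percent(depths),
          filtered_mean(depths, 10))
```

Both toy references have a mean of 0.5, but only `even` survives the filter, which is why a table of mean coverages alone cannot tell you which entries will be zeroed.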
Yes RPKM and TPM are treated the same as mean and relative_abundance.
HTH
super, thanks for the clarification!
hi Ben,
I'm sure there's a simple answer here, but I just want to confirm that the behavior I imagine in my head is consistent with the actual behavior. Regarding the --min-covered-fraction flag for contig and genome, this acts on a per-sample basis, correct? eg. does the example below illustrate the correct behavior?
Consider coverage matrix x, with taxa a, b, and c and with --min-covered-fraction=0

            a     b     c
sample 1    1.2   0.5   0.08
sample 2    0.09  0.11  0.5
sample 3    0.05  1.2   7.1

but when applying the default of --min-covered-fraction=10 this would yield

            a     b     c
sample 1    1.2   0.5   0
sample 2    0     0.11  0.5
sample 3    0     1.2   7.1

where the coverage for taxon c in sample 1 has been set to 0 and the coverage for taxon a in samples 2 and 3 has also been set to 0.
if this is correct, this would also result in anything using the length coverage estimator (eg. RPKM, TPM) to have those values reported as zero also, correct?
thanks very much!