maciejmotyka closed this issue 4 years ago
The --organism-ploidy option is silently ignored for the polyclone calling model. You can, however, use --max-clones to set the maximum "ploidy" of the sample.
Filtering is another issue. The polyclone model uses the --filter-expression option, which defaults to "QUAL < 10 | MQ < 10 | MP < 10 | AF < 0.05 | SB > 0.98 | BQ < 15 | DP < 1". The default filter expressions don't change according to the calling model (although different calling models may use different filter expressions). So any ALT alleles with < 0.05 empirical frequency are filtered. You can of course set a different --filter-expression for threshold filtering, or even use a random forest filter if you have suitable training data.
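To make the effect of the default expression concrete, here is a minimal Python sketch that evaluates it against one set of measure values. It is not Octopus's implementation: the function name `failed_clauses`, the simple "NAME OP VALUE" parsing, and the example measure values are all assumptions for illustration. It only shows why an ALT allele at 3% frequency would be filtered on AF even when every other measure passes.

```python
import operator

# Default threshold expression quoted in the reply above.
DEFAULT_EXPRESSION = (
    "QUAL < 10 | MQ < 10 | MP < 10 | AF < 0.05 | SB > 0.98 | BQ < 15 | DP < 1"
)

_OPS = {"<": operator.lt, ">": operator.gt}


def failed_clauses(measures, expression=DEFAULT_EXPRESSION):
    """Return the names of the measures whose clause evaluates to true.

    Hypothetical helper: each clause is assumed to have the form
    'NAME OP VALUE', with clauses separated by '|'.
    """
    failed = []
    for clause in expression.split("|"):
        name, op, threshold = clause.split()
        if _OPS[op](measures[name], float(threshold)):
            failed.append(name)
    return failed


# Made-up measure values for a single call (not taken from real data).
call = {"QUAL": 250.0, "MQ": 60.0, "MP": 40.0, "AF": 0.03,
        "SB": 0.5, "BQ": 35.0, "DP": 120.0}

print(failed_clauses(call))  # ['AF']
```

Note that none of the clauses refers to the ploidy, which is consistent with --organism-ploidy having no effect on this threshold filter.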
Thank you for a quick reply. I searched through the code and it starts to make sense now. Please check if my understanding is correct:
The AFBs are produced by the AF filter and mean that the allele frequencies are < 0.05.
The FILTER= description suggests that it takes into account --organism-ploidy, but it doesn't, and thus changing that parameter will not affect the filtering.
You're right - the description is misleading. It would probably be better to have a new measure (e.g. AFB) that computes the deviation of the AF from the expected allele frequency given the ploidy.
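One possible reading of that suggestion, sketched in Python. This is a hypothetical measure, not something Octopus provides: the name `af_deviation` and the choice of expected frequencies (k/ploidy for k = 1..ploidy copies of the ALT allele) are assumptions.

```python
def af_deviation(af, ploidy):
    """Distance from the observed allele frequency to the nearest frequency
    expected for a called ALT allele at the given ploidy.

    Assumption: an ALT allele present in k of `ploidy` copies is expected
    at frequency k / ploidy, for k = 1..ploidy.
    """
    if ploidy < 1:
        raise ValueError("ploidy must be at least 1")
    expected = [k / ploidy for k in range(1, ploidy + 1)]
    return min(abs(af - e) for e in expected)


# A 3% allele is far from any expected frequency at either ploidy,
# while a 50% allele matches a diploid heterozygote exactly.
print(af_deviation(0.03, 1))
print(af_deviation(0.03, 2))
print(af_deviation(0.5, 2))
```

Unlike the fixed AF < 0.05 threshold, a measure of this kind would actually change with --organism-ploidy, which is what the current AFB description implies.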
I also think it might be useful. Thank you for clarifying everything. Closing.
Request
The default value of --organism-ploidy is 2 for all callers. However, it would make sense to use a default of 1 for this mode, or at least to include a reminder in the description of the polyclone mode.
Bonus question
How does --organism-ploidy affect the polyclone mode? I expected it to be silently ignored, but I noticed some AFB in my VCF, so the filtering still uses it to check the allele frequencies.
##FILTER=<ID=AFB,Description="The called allele frequencies are not as expected for the given ploidy">
Are there any other consequences if it's left at the default value?
Version