Open TimD1 opened 4 years ago
Hi @TimD1,
This is an interesting one and I'm going to have to think about it a bit. I have to admit that the assertion doesn't make sense to me right now, but it has been quite a while since I wrote that code. Even if there are only 2 haplotypes (=combinations), it should be possible to arrange them into sets of 4 (AAAA, AAAB, AABB, ...).
I'll note that somatic calling isn't supported very well, even if you get ploidy 4 working. Nanopolish doesn't have a continuous allele frequency model (for subclonal mutations/cellularity) so I wouldn't expect the results to be very good. @jopineda in my group is working on somatic calling in general but it will be some time before we have results.
Jared
Okay, thanks for responding so quickly! You mention that "somatic calling isn't supported very well", and I agree that setting the ploidy to 4 is kind of a hacky fix to allow Nanopolish to call somatic variants. However, Nanopolish was the only nanopore tool flexible enough to even allow me to attempt it. I'm unaware of any other nanopore tools designed specifically for this purpose; do you know of any off the top of your head? I know that WhatsHap (used by medaka_variant), Longshot, and Clair all specifically target germline mutations in diploid organisms.
As far as I know there isn't a dedicated tool to call somatic mutations from nanopore reads.
I've been trying to call somatic variants in a diploid organism, and so would like to set --ploidy 4 for nanopolish variants. Whenever I do this, however, I get the following error:
I am calling Nanopolish as follows:
Here is the resulting backtrace:
Looking at the source code, it appears that the constructor for Combinations is called with k == ploidy and N == variant_group.get_num_combinations(). The assertion which fails is k <= N. Combinations are added if considered a good_haplotype, so does this mean there aren't enough candidate variants within my region of interest to determine possible phasings for a 4-ploid organism? I've tried everything from relaxing to restricting the requirements for candidate variants and significantly increasing --max-haplotypes to one million, but nothing seems to help.
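To illustrate the k <= N question, here is a minimal Python sketch. It is not Nanopolish's C++ implementation, and the function names are made up for this example. It only assumes that the failing assertion describes choosing k distinct items out of N, which is impossible when k > N, whereas choosing with repetition (the AAAA, AAAB, AABB, ... arrangement) still works with 2 haplotypes at ploidy 4.

```python
from itertools import combinations, combinations_with_replacement


def genotypes_without_repeats(haplotypes, ploidy):
    """Choose `ploidy` distinct haplotypes; empty when ploidy > len(haplotypes)."""
    return ["".join(g) for g in combinations(haplotypes, ploidy)]


def genotypes_with_repeats(haplotypes, ploidy):
    """Choose `ploidy` haplotypes, allowing the same haplotype more than once."""
    return ["".join(g) for g in combinations_with_replacement(haplotypes, ploidy)]


# Hypothetical input: two candidate haplotypes (N = 2) at ploidy k = 4.
haplotypes = ["A", "B"]
ploidy = 4

no_repeats = genotypes_without_repeats(haplotypes, ploidy)
with_repeats = genotypes_with_repeats(haplotypes, ploidy)

print(no_repeats)    # []
print(with_repeats)  # ['AAAA', 'AAAB', 'AABB', 'ABBB', 'BBBB']
```

If the assertion really does enforce selection without repetition, that would explain why raising --max-haplotypes has no effect when only a few haplotypes pass the filter, but that reading of the source is an assumption on my part.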