Closed mgonzalezporta closed 2 years ago
Hi Mar, here are some of my thoughts on these two samples.
SAM3877: This is a rare SV combination that Cyrius does not recognize. My best guess for the genotype is *10+*36/*13: a fusion duplication on one allele and a fusion deletion on the other. Cyrius has not seen this combination before, so it makes a no-call.
SAM3885: My best guess for the genotype is *10/*10+*36. One haplotype is a rare form of *10 that Cyrius does not recognize (it lacks one variant that most *10s have).
Thanks Xiao,
FYI, here are the calls inferred from additional tools, which are also inconsistent with each other:

| Sample | Cyrius | Aldy | StellarPGx |
|---|---|---|---|
| SAM3877 | No call | *1/*36+*10 | *1/*36x2+*10 |
| SAM3885 | No call | *10/*36+*10 | *10/*10x2 |

Noted that a subset of samples will need manual follow-up.
Happy for the ticket to be closed.
After seeing the depth issue in the other ticket (#18), I took another look at SAM3877. The problem seems to be similar to SAM3865. The MAD is a bit on the high side, and the d67_snp_raw values deviate from integers. This suggests that the D6+D7 CN call may be off: Cyrius called 4 but it could be 5, and a CN of 5 brings d67_snp_raw closer to integer values (see two plots below). The genotype should be *1/*36+*10 if the total CN is 5, making it consistent with Aldy.
These two samples make me wonder if there is a systematic problem in your samples, e.g. some alignment problem that gives the D6/D7 regions lower coverage than other parts of the genome. Are your samples all processed using the same library prep/pipeline? Or is there anything specific about these failing samples? If you plot out Total_CN_raw across a large number of your samples, do you see them falling close to integer values or is there a shift towards lower values? I might be overthinking this, but it could be good to check.
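In case it helps, here is a minimal sketch of that check. It assumes you have already collected one Total_CN_raw value per sample into a plain dictionary; the sample names and numbers below are made up for illustration, and the helper names (`integer_residual`, `summarize_shift`) are hypothetical, not part of Cyrius. A median residual well below zero, or a large fraction of samples sitting below their nearest integer, would hint at a systematic shift towards lower values.

```python
from statistics import median


def integer_residual(value: float) -> float:
    """Signed distance from the nearest integer (negative means below it)."""
    return value - round(value)


def summarize_shift(total_cn_raw: dict[str, float]) -> dict[str, float]:
    """Summarize how raw copy-number values sit relative to integers.

    total_cn_raw maps a sample name to its raw total copy number
    (assumed to be gathered from the per-sample output beforehand).
    """
    residuals = [integer_residual(v) for v in total_cn_raw.values()]
    return {
        # Median signed offset; close to 0 when there is no systematic shift.
        "median_residual": median(residuals),
        # Share of samples that fall below their nearest integer.
        "fraction_below": sum(r < 0 for r in residuals) / len(residuals),
    }


# Made-up values, for illustration only.
example = {"sampleA": 3.98, "sampleB": 4.62, "sampleC": 3.71, "sampleD": 4.05}
summary = summarize_shift(example)
print(summary)
```

Values that land exactly halfway between two integers are ambiguous under this approach, so it is only meant as a quick screen before looking at the plots.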
Hi Xiao,
Still compiling a larger dataset analysed using DRAGEN 3.7. Will re-check trends there and re-open if still relevant.
Thanks
Hi Xiao,
We've come across a few cases where Cyrius reports a no call, in spite of the sample having coverage > 30X. Would you have some recommendations to troubleshoot this further (e.g. not possible to resolve star alleles)?
Attaching a couple of examples: Archive.zip
Thanks