Closed mgonzalezporta closed 2 years ago
@mgonzalezporta Thanks for the useful suggestion. I will add the 'no call' assignment for such cases.
I see that the computed copy number for WHB10006 sample is 8 and the background alleles are CYP2D610/10. This individual is likely to have one of the more complex 36+10xN or 36xN+10xN alleles, especially since they are part of SG10K. We are still working on enhancements for StellarPGx to be able to call these more complex hybrid tandem rearrangements where CN>4. Given the complexity of CYP2D6 allele combinations in SG10K, I would suggest you use Aldy and Cyrius as well for your CYP2D6 allele calling and generate a consensus call set.
Thanks for the quick response David. We are indeed inspecting the consensus across StellarPGx, Cyrius and Aldy. FYI, for the sample here Cyrius outputs a no call, and aldy reports 36 and 10 with multiple copies (*10+*10+*10+*10+*10+*10+*36.ALDY+*10/*36.ALDY+*10
). Looking forward to the enhancement.
I have similar problems for CYP2D6 on multiple samples. I hope it is related to the initial post.
python3 bin/stellarpgx.py b37/diplo_db_debugged2.dbs V350029497_L02_MANsbdpX078095-736-1_vars/V350029497_L02_MANsbdpX078095-736-1_core_snvs.dip V350029497_L02_MANsbdpX078095-736-1_vars/V350029497_L02_MANsbdpX078095-736-1_full.dip V350029497_L02_MANsbdpX078095-736-1_vars/V350029497_L02_MANsbdpX078095-736-1_gt.dip b37/genotypes4.dbs V350029497_L02_MANsbdpX078095-736-1_gene_del/V350029497_L02_MANsbdpX078095-736-1_gene_del_summary.txt V350029497_L02_MANsbdpX078095-736-1_gene_dup/V350029497_L02_MANsbdpX078095-736-1_gene_dup_summary.txt V350029497_L02_MANsbdpX078095-736-1_cyp2d6_dp b37/haps_var_new.dbs b37/a_scores.dbs
--------------------------------------------
CYP2D6 Star Allele Calling with StellarPGx
--------------------------------------------
Initially computed CN = 15
Sample core variants:
42522613~G>C~0/1;42523943~A>G~0/1;42526763~C>T~0/1
Candidate alleles:
['1.v1_35.v1']
Result:
check
Activity score:
Traceback (most recent call last):
File "bin/stellarpgx.py", line 670, in <module>
ac_score = get_ac_score(act_score, gene_alleles)
File "bin/stellarpgx.py", line 645, in get_ac_score
p_allele = allele_dict[m_allele] + "_" + n_allele
KeyError: 'check'
The crash happens because function dup_test_cn_n runs to the end of the if-chain and returns 'check' which cannot be handled by get_ac_score().
A quick fix was to replace in line stellarogx.py:664
`if gene_alleles == "":
ac_score = "Indeterminate"
print(ac_score)
with
if gene_alleles in ["", "check"]:
ac_score = "Indeterminate"
print(ac_score)
Is this appropriate, or does gene_alleles == 'check' demand a more informative output?
@mgonzalezporta – Thanks again for opening this issue. I have addressed the error and 36xN+10xN calling with 3ab19a9
@pmginacio – Thanks for the suggestion at stellarpgx.py:664. We had included the 'check' output to flag any samples that require visual/ in-depth inspection. For instance, the V350029497_L02_MANsbdpX078095-736-1 sample you are analysing seems to have CN=15 which is rare. May you try running it again with the new commit. Thanks.
@mgonzalezporta and @pmginacio – Thanks again for pointing out these bugs. I am closing this issue – feel free to reopen it in case of anything else related to this.
Describe the bug The
call_stars
process for CYP2D6 exits with error, resulting in a truncatedcyp2d6.alleles
file. Here the error log:The trigger seems to be the fact that there are no het variants genotyped for this sample, which makes
test_list2
an empty list.To Reproduce Here the command used to launch StellarPGx for the conflicting sample:
Expected behavior Our suggestion would be to issue a no call, as opposed to exiting with an error code.