Closed NTNguyen13 closed 2 years ago
@NTNguyen13,
Good question! The RefAllele column in the gene-table.csv file gives you the reference star allele for the given gene (some people refer to it as the "wild-type" allele, but "reference allele" is the preferred term).
For example, the CYP2D6 gene has *1 as its reference allele, and therefore RefAllele is *1. Now, if you look at the CYP2D6 sequence in GRCh37, you will find that it actually matches that of *2; therefore, the GRCh37Default column is *2. Finally, when you do the same for GRCh38, its CYP2D6 sequence matches that of *1, and so GRCh38Default is *1.
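To make the relationship between the three columns concrete, here is a minimal sketch that sorts genes by how their assembly defaults compare to RefAllele. It does not use the PyPGx API; it just parses a small inline table with the standard library. The CYP2D6 row uses the values described above (with star alleles written as *1, *2), while GENE_X, GENE_Y, and the classify helper are hypothetical and only there for illustration.

```python
import csv
import io

# Inline stand-in for a few gene-table.csv columns. The CYP2D6 row uses the
# values from this thread; GENE_X and GENE_Y are hypothetical rows.
GENE_TABLE = """Gene,RefAllele,GRCh37Default,GRCh38Default
CYP2D6,*1,*2,*1
GENE_X,*1,*1,*1
GENE_Y,*1,*3,*3
"""


def classify(row):
    """Describe how the two assembly defaults relate to RefAllele."""
    ref = row["RefAllele"]
    d37 = row["GRCh37Default"]
    d38 = row["GRCh38Default"]
    if d37 == ref and d38 == ref:
        return "both assemblies carry the reference allele"
    if d37 == d38:
        return "assemblies agree with each other but not with RefAllele"
    return "assemblies differ from each other"


rows = list(csv.DictReader(io.StringIO(GENE_TABLE)))
summary = {row["Gene"]: classify(row) for row in rows}

for gene, label in summary.items():
    print(f"{gene}: {label}")
```

The same classification could be run on the real gene table by reading the CSV file instead of the inline string, assuming it has these column names.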
Let me know if you have more questions.
P.S. You will see that the NAT2 gene has *4 as its reference allele instead of *1. That's for historical reasons. See the official NAT2 alleles page for more details (http://nat.mbg.duth.gr/Human%20NAT2%20alleles_2013.htm).
Thank you for the quick response! So if I find a new gene to add to pypgx, I can assign RefAllele based on a literature review, and GRCh37Default and GRCh38Default based on the human genome sequence for the corresponding assembly version. Am I right?
That's correct! Though I would strongly advise that if you have a PGx gene you'd like to add to PyPGx, please first open a new issue in the repository for discussion before making a PR 😄
yes, I'm definitely gonna follow that!
Hi, this question is mostly about better understanding the pypgx data structure.
I tried to figure out the meaning of RefAllele, but my understanding is not quite right. I thought RefAllele was the allele represented in the human reference genome (the FASTA file), but GRCh37Default and GRCh38Default already represent that. I also saw cases where GRCh37Default and GRCh38Default differ from each other (I think because of changes between GRCh37 and GRCh38), and I found 5 cases where GRCh37Default and GRCh38Default are the same but differ from RefAllele.
I found this logic check, which assigns an allele when no candidate is found, but I still don't fully understand the role of RefAllele.
Could you please explain what RefAllele is, and how to assign it in the gene table? Thank you very much.