The REST call to https://api.cpicpgx.org/v1/allele_definition?genesymbol=eq.CYP2C19&select=genesymbol,name,allele_location_value(sequence_location(name,dbsnpid,position))&order=name gives a somewhat unusual *1 definition: `{"genesymbol":"CYP2C19","name":"1","allele_location_value":[{"sequence_location": {"name": "80161A>G", "dbsnpid": "rs3758581", "position": 94842866}}]}`.
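For anyone who wants to reproduce this, here is a minimal Python sketch that rebuilds the query URL above and pulls the defining variant out of the row shown. The helper names (`build_query_url`, `defining_variants`) are my own and not part of the CPIC API, and the row is the one pasted above rather than the result of a live call.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://api.cpicpgx.org/v1/allele_definition"

# Nested select copied from the query in the question.
SELECT = ("genesymbol,name,"
          "allele_location_value(sequence_location(name,dbsnpid,position))")


def build_query_url(gene):
    """Rebuild the query string used in the question for a given gene symbol."""
    params = {
        "genesymbol": "eq." + gene,  # equality filter, as in the original URL
        "select": SELECT,
        "order": "name",
    }
    # Leave commas and parentheses unescaped so the URL matches the original.
    return BASE_URL + "?" + urlencode(params, safe="(),")


def defining_variants(row):
    """Return (name, dbsnpid, position) for each variant attached to one row."""
    return [
        (v["sequence_location"]["name"],
         v["sequence_location"]["dbsnpid"],
         v["sequence_location"]["position"])
        for v in row["allele_location_value"]
    ]


# The single row quoted in the question, used here instead of a network call.
SAMPLE_ROW = json.loads(
    '{"genesymbol":"CYP2C19","name":"1","allele_location_value":'
    '[{"sequence_location": {"name": "80161A>G", "dbsnpid": "rs3758581", '
    '"position": 94842866}}]}'
)

if __name__ == "__main__":
    print(build_query_url("CYP2C19"))
    print(SAMPLE_ROW["genesymbol"], defining_variants(SAMPLE_ROW))
```

To run it against the live service, the URL from `build_query_url` can be fetched with any HTTP client and each returned row passed to `defining_variants`. I am assuming the live rows have the same shape as the one quoted here.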
For most genes, *1 is defined as the wild type with normal function. Why is CYP2C19 different, even though rs3758581 is benign? It is possible that the reference genome carries a relatively rare allele at this position while the vast majority of the population does not (also known as reference bias). That would make most samples appear to carry a variant with an extremely high allele frequency. Even so, the star allele does not have to be defined by that variant.
Perhaps this is not an informatics question; it is more of a PGx question.
You're right, it is a PGx question. You can read more about why CYP2C19 *1 is defined the way it is in these two resources from PharmVar (which defines the CYP2C19 alleles):