sbslee / pypgx

A Python package for pharmacogenomics (PGx) research
https://pypgx.readthedocs.io
MIT License

UGT1A1 Structural variants #136

Closed Jorisvansteenbrugge closed 3 months ago

Jorisvansteenbrugge commented 3 months ago

Hi @sbslee,

First of all, I think you are doing fantastic work with pypgx.

I would be interested in calling structural variants for UGT1A1, as tandem repeats are quite important in phenotyping UGT1A1 (e.g., https://www.pharmgkb.org/haplotype/PA166115842).

Are structural variants for UGT1A1 already on your radar for an upcoming release? Thanks!

sbslee commented 3 months ago

Hi @Jorisvansteenbrugge,

Thanks for your question.

  1. SVs are typically larger than 1 kb, so I'd say the tandem repeats you mentioned fall into the category of INDELs rather than SVs.

  2. The tandem repeat alleles you mentioned (e.g., UGT1A1*28) are already covered by PyPGx:

https://github.com/sbslee/pypgx/blob/7bec840a03b54556feac9fb80351875c3a48d669/pypgx/api/data/allele-table.csv#L1292-L1300
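To make the size distinction in point 1 concrete, here is a minimal sketch that classifies a variant by how much it changes the sequence length. It is not PyPGx code: the function name `classify_variant`, the constant `SV_MIN_LENGTH`, and the example alleles are hypothetical, and the 1 kb cutoff is simply the rule of thumb quoted above (other sources draw the line elsewhere).

```python
# Assumption: the cutoff follows the "larger than 1 kb" rule of thumb from
# this thread. It is not a PyPGx setting, and other definitions exist.
SV_MIN_LENGTH = 1000


def classify_variant(ref: str, alt: str, sv_min_length: int = SV_MIN_LENGTH) -> str:
    """Classify a REF/ALT allele pair by the size of the length change."""
    length_change = abs(len(alt) - len(ref))
    if length_change == 0:
        # Same length: single- or multi-nucleotide substitution.
        return "SNV" if len(ref) == 1 else "MNV"
    if length_change > sv_min_length:
        return "SV"
    return "INDEL"


# Hypothetical alleles for illustration only: a tandem repeat gaining one
# extra "TA" unit, as in UGT1A1*28, changes the sequence length by just 2 bp.
print(classify_variant("A", "ATA"))  # INDEL
```

Under this rule, a one-unit change in a dinucleotide repeat is three orders of magnitude below the SV cutoff, which is why such alleles are handled as ordinary small variants.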

Hope this helps.

Jorisvansteenbrugge commented 3 months ago

That actually makes a lot of sense. I got a little mixed up in the definitions, thank you for clarifying!