ersilia-os / pharmacogx-embeddings

Pharmacogenomics knowledge graph embeddings and related analyses
GNU General Public License v3.0

GWAS Catalog for variant annotation #19

Closed miquelduranfrigola closed 12 months ago

miquelduranfrigola commented 12 months ago

The GWAS Catalog is a relatively complete collection of published GWAS results.

We could, in principle, download all variants reported in the GWAS Catalog and gather their traits. This would give us yet another annotation level for the variants.
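As a rough illustration of the gathering step described above, here is a minimal sketch that groups traits by variant from a tab-separated associations file. It assumes the `SNPS` and `DISEASE/TRAIT` column names used in the GWAS Catalog associations download, and it assumes that rows listing several variants separate them with `;`. The function name `traits_by_variant` and the sample rows are hypothetical and only serve the example; this is not the code used in this repository.

```python
import csv
import io
from collections import defaultdict


def traits_by_variant(tsv_handle, snp_col="SNPS", trait_col="DISEASE/TRAIT"):
    """Map each variant ID to the sorted list of traits reported for it."""
    traits = defaultdict(set)
    reader = csv.DictReader(tsv_handle, delimiter="\t")
    for row in reader:
        trait = (row.get(trait_col) or "").strip()
        if not trait:
            continue  # skip rows without a trait label
        # Assumption: several variants in one row are separated by ";".
        for snp in (row.get(snp_col) or "").split(";"):
            snp = snp.strip()
            if snp:
                traits[snp].add(trait)
    return {snp: sorted(names) for snp, names in traits.items()}


# Tiny made-up sample (placeholder variant IDs, not real catalog rows).
sample = (
    "SNPS\tDISEASE/TRAIT\n"
    "rs0000001\tresponse to rifampicin\n"
    "rs0000001\texample trait A\n"
    "rs0000002; rs0000003\texample trait B\n"
)
annotations = traits_by_variant(io.StringIO(sample))
```

With a real download, the same function could be called on an open file handle instead of the in-memory sample. The resulting variant-to-traits mapping is the kind of annotation level this comment refers to.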

Although some PGx-relevant traits such as "response to rifampicin" are available, they are the exception rather than the rule. It is therefore generally not worth using the GWAS Catalog as a source of PGx annotation.

However, encoding traits (beyond PGx) as embeddings may eventually prove highly informative.

miquelduranfrigola commented 12 months ago

While GWAS Catalog traits can be obtained through SnpEff, we have decided to download them directly from the GWAS Catalog portal.

We have calculated embeddings from GWAS Catalog using BioGPT. They can be found in data/gwas_ebi/gwas_catalog_biogpt_embeddings.h5.

For now, I consider this issue closed. We can reopen it anytime.