martinjzhang / scDRS

Single-cell disease relevance score (scDRS)
https://martinjzhang.github.io/scDRS/
MIT License

Converting GWAS catalog .tsv file to gene set #89

Closed kayihui closed 1 month ago

kayihui commented 1 month ago

Hi scDRS team,

I have the following questions about making a custom gene set:

I would like to use a dataset downloaded from the GWAS Catalog for a particular disease trait: https://www.ebi.ac.uk/gwas/downloads

The association .tsv file they provide has a p-value column and the gene associated with each entry, so it is not in the .sumstats format described in your documentation. For example: https://www.ebi.ac.uk/gwas/efotraits/MONDO_0005180

How would you suggest converting the association .tsv file to the gene set format?

Or is it necessary to start with the full summary statistics, which are also available on the website?

Thank you very much. Ka Yi

martinjzhang commented 1 month ago

Hi,

> The association .tsv file they provided has p-value column and the gene associated with it. it's not like the .sumstats format you provided in the documentation

Did you mean SNPs instead of genes?

MAGMA would need association statistics for SNPs across the whole genome, not just the top hits.

kayihui commented 1 month ago

Hi Martin,

I solved my problem yesterday. I was able to convert the .tsv file from the GWAS Catalog to a .gs file.
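For anyone landing here with the same question, here is a minimal sketch of one way such a conversion could look. It is not the method used in this thread (which was not posted) and it is not the MAGMA-based pipeline mentioned above. It assumes the association file has `MAPPED_GENE` and `P-VALUE` columns, that multiple genes in one cell are separated by `,`, `;` or ` - `, and that a `.gs` file is a tab-separated table with `TRAIT` and `GENESET` columns where `GENESET` is a comma-separated gene list. The function name `gwas_associations_to_gs` and the `p_max` threshold are hypothetical choices for illustration.

```python
import csv
import io
import re


def gwas_associations_to_gs(tsv_text, trait_name, p_max=5e-8):
    """Build .gs-style text from GWAS Catalog association rows.

    Assumptions (check against your own file): the columns are named
    MAPPED_GENE and P-VALUE, and cells with several genes use ',', ';'
    or ' - ' as separators.
    """
    genes = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        try:
            p_value = float(row["P-VALUE"])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without a usable p-value
        if p_value > p_max:
            continue  # keep only associations at or below the threshold
        cell = row.get("MAPPED_GENE") or ""
        # Split on the assumed separators; hyphens inside a symbol
        # (e.g. HLA-DRB1) are kept because they have no spaces around them.
        for gene in re.split(r"\s*(?:,|;|\s-\s)\s*", cell):
            gene = gene.strip()
            if gene and gene not in genes:
                genes.append(gene)  # de-duplicate, keep first-seen order
    return "TRAIT\tGENESET\n" + f"{trait_name}\t{','.join(genes)}\n"


# Tiny made-up input, only to show the shape of the data.
sample = (
    "MAPPED_GENE\tP-VALUE\n"
    "BDNF\t1e-9\n"
    "HLA-DRB1 - HLA-DQA1\t3e-10\n"
    "NEGR1, BDNF\t2e-8\n"
    "FOO\t0.01\n"
)
gs_text = gwas_associations_to_gs(sample, "TRD")
print(gs_text)
```

A list built this way only contains the reported top hits, so it can easily end up with very few genes.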

But I have a different question. My list has 22 genes, and the program stopped.

Computing scDRS score:
trait=Treatment resistant depression: skipped due to small size (n_gene=8, sys_time=2.8s)

What's the minimum number of genes in the list for the program to run?

Thank you.

martinjzhang commented 1 month ago

The minimum number is 10. It seems that, out of your 22 genes, only 8 were recognized as overlapping with the genes in the scRNA-seq data.
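A quick way to see which genes are being dropped is to compare the gene list against the gene names in the expression data before running the scoring step. The sketch below uses plain Python sets; `check_overlap` is a hypothetical helper, not part of scDRS, and the gene names are made up. The threshold of 10 is the minimum stated above.

```python
MIN_OVERLAP = 10  # minimum gene set size stated in this thread


def check_overlap(gene_set, data_genes, min_overlap=MIN_OVERLAP):
    """Return (matched, missing, ok) for a gene list against the data genes.

    Hypothetical helper for illustration; matching is exact and
    case-sensitive.
    """
    data_lookup = set(data_genes)
    matched = [g for g in gene_set if g in data_lookup]
    missing = [g for g in gene_set if g not in data_lookup]
    return matched, missing, len(matched) >= min_overlap


# Made-up example: 4 genes in the list, only 2 present in the data.
gene_set = ["BDNF", "NEGR1", "HLA-DRB1", "FOO"]
data_genes = ["BDNF", "NEGR1", "GAPDH", "ACTB"]
matched, missing, ok = check_overlap(gene_set, data_genes)
print(f"matched={len(matched)}, missing={missing}, enough={ok}")
```

With an AnnData object, `data_genes` would typically be `adata.var_names`. If most genes show up as missing, possible causes include a different identifier type (for example Ensembl IDs versus gene symbols) or a different species naming convention in the expression data.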