getian107 / PRScsx

Cross-population polygenic prediction
MIT License
How to get the available SNP list used in the auto-PRS model #42

Closed koujiaodahan closed 7 months ago

koujiaodahan commented 8 months ago

Hi, I have summary statistics from an EAS population (no validation data, no bim file, no genotype data). I want to run PRScs-auto to get the PRS model, and in particular the list of SNPs in the best PRS model. How can I achieve this?

getian107 commented 8 months ago

I'm not sure I understand the question. Are you saying that you want to run PRS-CS but do not have a target dataset in mind for now? In that case, you can build a fake bim file that includes all the SNPs in the summary stats or in the reference panel, so that you retain the maximum number of SNPs in the posterior file for future use.
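A minimal sketch of the fake bim idea above, assuming the summary statistics file is whitespace-delimited with a header containing SNP, A1 and A2 columns. The function name `write_fake_bim` and the file paths are hypothetical. The chromosome, genetic distance and position columns are filled with the placeholder 0, on the assumption that only the SNP ID and the two alleles are used when matching against the bim file; that assumption should be checked against the PRS-CS code before relying on it.

```python
import os
import tempfile


def write_fake_bim(sumstats_path, bim_path):
    """Write a placeholder .bim listing every SNP in a summary-statistics file.

    Assumes a whitespace-delimited file whose header has SNP, A1 and A2 columns.
    Returns the number of SNPs written.
    """
    n_written = 0
    with open(sumstats_path) as src, open(bim_path, "w") as dst:
        header = src.readline().split()
        # Locate the needed columns by name rather than by fixed position.
        snp_i, a1_i, a2_i = (header.index(c) for c in ("SNP", "A1", "A2"))
        for line in src:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            # .bim columns: chromosome, SNP ID, cM, bp, allele 1, allele 2.
            # Chromosome, cM and bp are placeholders (0); this is an assumption.
            row = ["0", fields[snp_i], "0", "0", fields[a1_i], fields[a2_i]]
            dst.write("\t".join(row) + "\n")
            n_written += 1
    return n_written


# Small demonstration on a made-up two-SNP file.
demo_dir = tempfile.mkdtemp()
demo_sst = os.path.join(demo_dir, "sumstats.txt")
demo_bim = os.path.join(demo_dir, "fake.bim")
with open(demo_sst, "w") as fh:
    fh.write("SNP\tA1\tA2\tBETA\tP\n")
    fh.write("rs1\tA\tG\t0.01\t0.5\n")
    fh.write("rs2\tC\tT\t-0.02\t0.1\n")
n_snps = write_fake_bim(demo_sst, demo_bim)
```

The resulting file would then be passed through `--bim_prefix` as the path without the `.bim` extension.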

koujiaodahan commented 8 months ago

Well, yes, I have built a fake bim file for later overlapping.

koujiaodahan commented 7 months ago

I hit an error when running the test example (UnboundLocalError: local variable 'ref_dict' referenced before assignment). By the way, if I have GWAS summary statistics from a single population (EAS), should I run PRScs instead of PRScsx, since PRScsx is designed for cross-population prediction?

getian107 commented 7 months ago

The 'ref_dict' error usually occurs because the reference directory was not specified correctly. If you keep having this issue, please send a screenshot of the reference folder as well as the command line you used to call PRS-CSx.
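One way to check the reference directory before calling PRS-CSx is sketched below. It only tests whether a SNP information file is present in the folder. The two file names are an assumption based on the download instructions in the README and should be verified against your own copy; the helper name `find_snpinfo` is hypothetical and not part of PRS-CSx.

```python
import os
import tempfile

# Assumed SNP info file names (1000 Genomes and UK Biobank reference panels).
# These are taken from memory of the README; verify them before relying on this.
SNPINFO_FILES = ("snpinfo_mult_1kg_hm3", "snpinfo_mult_ukbb_hm3")


def find_snpinfo(ref_dir, expected=SNPINFO_FILES):
    """Return the expected SNP info files that exist directly under ref_dir."""
    if not os.path.isdir(ref_dir):
        return []  # wrong path: nothing can be found
    return [name for name in expected
            if os.path.isfile(os.path.join(ref_dir, name))]


# Demonstration with a temporary folder standing in for a reference directory.
demo_ref = tempfile.mkdtemp()
found_before = find_snpinfo(demo_ref)   # empty folder, nothing found
open(os.path.join(demo_ref, "snpinfo_mult_1kg_hm3"), "w").close()
found_after = find_snpinfo(demo_ref)    # the 1000 Genomes file is now present
missing_dir = find_snpinfo(os.path.join(demo_ref, "does_not_exist"))
```

An empty result for the folder given to `--ref_dir` would be consistent with the explanation above, though it does not rule out other causes.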

If you only have GWAS from a single population, you can use PRS-CS, although PRS-CSx is equivalent to PRS-CS in that scenario.