getian107 / PRScsx

Cross-population polygenic prediction
MIT License

KeyError #48

Closed AnjeGrobler closed 6 months ago

AnjeGrobler commented 7 months ago

Good day, Doctor

I am trying to run a PRS. My code is as follows:

python3 PRScsx.py --ref_dir=/scratch/grbanj001/PRScsx/ --bim_prefix=/scratch/grbanj001/PRScsx/Kids --sst_file=/scratch/grbanj001/PRScsx/MDD_EUR4.txt,/scratch/grbanj001/PRScsx/MDD_EAS4.txt --n_gwas=500199,194548 --pop=EUR,EAS --out_dir=/scratch/grbanj001/PRScsx/ --out_name=Kids_MDD --a=1 --b=0.5 --phi=1e-2 --thin=5 --meta=TRUE

But I am presented with this error message:

Traceback (most recent call last):
  File "PRScsx.py", line 154, in <module>
    main()
  File "PRScsx.py", line 137, in main
    sst_dict[pp] = parse_genet.parse_sumstats(ref_dict, vld_dict, param_dict['sst_file'][pp], param_dict['pop'][pp], param_dict['n_gwas'][pp])
  File "/scratch/grbanj001/PRScsx/parse_genet.py", line 92, in parse_sumstats
    set(zip(snp_ref, [mapping[aa] for aa in a1_ref], [mapping[aa] for aa in a2_ref])) | \
  File "/scratch/grbanj001/PRScsx/parse_genet.py", line 92, in <listcomp>
    set(zip(snp_ref, [mapping[aa] for aa in a1_ref], [mapping[aa] for aa in a2_ref])) | \
KeyError: '\x1a\x1a\x1a\x1a\x1a\x1a\x1a\x1a\x1a\x1a\x1a\x1a\x1aC'

I have done some research but cannot seem to find the solution. Any help will be greatly appreciated.

Best, Anje

getian107 commented 7 months ago

Hi Anje -- This error has been reported before and seems to be an issue related to the computing cluster: https://github.com/getian107/PRScsx/issues/47 Maybe you can try running on a different machine?

AnjeGrobler commented 7 months ago

Hi doctor,

My apologies, I tried researching this issue but did not come across that thread. Thank you for sharing.

Best, Anje

chunyu-yes commented 7 months ago

Hi both,

I encountered the same error and discovered that it is caused by one incorrect A2 entry in the reference panel file "snpinfo_mult_1kg_hm3". The affected SNP is on chromosome 1. Its A2 entry reads "\x1a\x1a\x1a\x1a\x1a\x1a\x1a\x1a\x1a\x1a\x1a\x1a\x1aC" instead of "C". After correcting this entry, the program ran smoothly. Hope this helps.
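
For anyone who wants to locate such an entry in their own copy of the file, here is a minimal sketch in Python. It assumes the SNP info file is whitespace-delimited with a header row that names the allele columns A1 and A2. The function names (find_bad_alleles, strip_control_chars, scan_file) are my own and are not part of PRScsx. It only reports suspicious entries, so please check each one by hand before editing the reference panel.

```python
import re

# Alleles that the strand-flip lookup in the traceback can be expected to handle.
VALID_ALLELES = {"A", "C", "G", "T"}

# ASCII control characters (including \x1a) plus DEL.
CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def strip_control_chars(value):
    """Remove ASCII control characters from a single field."""
    return CONTROL_CHARS.sub("", value)


def find_bad_alleles(lines, allele_cols=("A1", "A2")):
    """Return (line_number, column_name, raw_value) for each allele
    that is not exactly one of A, C, G, T.

    `lines` is the file content as a list of strings, header first.
    Line numbers are 1-based and count the header as line 1.
    """
    header = lines[0].split()
    col_index = {name: header.index(name) for name in allele_cols}
    bad = []
    for lineno, line in enumerate(lines[1:], start=2):
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        for name, idx in col_index.items():
            if idx >= len(fields) or fields[idx] not in VALID_ALLELES:
                raw = fields[idx] if idx < len(fields) else ""
                bad.append((lineno, name, raw))
    return bad


def scan_file(path):
    """Read a SNP info file and return its suspicious allele entries.

    latin-1 is used so that every byte decodes to a character,
    which keeps stray control bytes visible instead of raising an error.
    """
    with open(path, encoding="latin-1") as handle:
        return find_bad_alleles(handle.readlines())
```

Calling scan_file on the path to your local snpinfo_mult_1kg_hm3 returns a list of flagged entries. For each one, repr() of the raw value shows any hidden control characters, and strip_control_chars shows what the field would look like once they are removed.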

AnjeGrobler commented 7 months ago

Hello Doctor,

Thank you very much. That is of great help.

Best, Anje

getian107 commented 7 months ago

Thanks so much for letting me know! It seems that the incorrect A2 entry was introduced recently, which is odd since we haven't modified the reference panel for a while. I will double-check to make sure that everything looks good.