Closed krishnaxamin closed 4 months ago
Hi,
It is possible that there are some unrecognized notations in your CHR column (like chrX or something).
Could you try running saige_sumstats.fix_chr() before saige_sumstats.get_lead()? I think it may detect and fix those unrecognized notations.
Hi, thanks for the reply. Unfortunately, the error persists after using fix_chr(). fix_chr() does pick up on 'XY', but the other datasets I am using also contain 'XY' and run with no issues.
FYI, here are the values within the CHR column of saige_gwaslab before it is converted to saige_sumstats.
saige_gwaslab['CHR'].unique()
array(['1', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19',
'2', '20', '21', '22', '3', '4', '5', '6', '7', '8', '9', 'X',
'XY'], dtype=object)
Hi, thanks for the additional information. gwaslab did not recognize XY and converted it to an NA value. I think the solution for now is to manually convert all XY to X, since gwaslab does not distinguish PAR and non-PAR regions for chrX. The error occurred only for this dataset because it contains significant associations in XY, while the others do not.
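For anyone landing here with the same problem, the manual conversion could look roughly like the sketch below. It assumes the summary statistics sit in a pandas DataFrame with a string CHR column, as the unique() output above suggests. The toy frame and its POS column are made up for illustration; only the name saige_gwaslab and the 'XY' label come from this thread.

```python
import pandas as pd

# Toy stand-in for the DataFrame called saige_gwaslab in this thread.
# The rows and the POS column are hypothetical, for illustration only.
saige_gwaslab = pd.DataFrame({
    "CHR": ["1", "22", "X", "XY", "XY"],
    "POS": [1000, 2000, 3000, 4000, 5000],
})

# Relabel the pseudo-autosomal code 'XY' as 'X'; all other labels are untouched.
saige_gwaslab["CHR"] = saige_gwaslab["CHR"].replace({"XY": "X"})

# Confirm that no 'XY' labels remain before handing the frame to gwaslab.
remaining_labels = sorted(saige_gwaslab["CHR"].unique())
print(remaining_labels)
```

The relabelled DataFrame can then be loaded into gwaslab in the same way as before. Note that this merges PAR variants into chrX, so the original 'XY' labels are lost unless a copy of the column is kept.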
Hi,
I see - thank you! That's worked now.
I assume, from fix_chr(), that gwaslab wants numeric chromosome codes in general, e.g. it will want '23' rather than 'X' for chrX. If so, it may be nice to have that noted somewhere in the wiki, just for convenience - apologies if it is already there and I missed it.
Thanks for your help!
Hi,
Great package - so nice being able to do all this in Python.
Code and error detailed below:
I have checked the 'CHR' values - no NaNs present. The code works fine with the other summary statistics I am using, which have the same format.
I am using v3.4.40 as set up using the provided yaml file.
Any tips? I can provide the input datafile if needed.
Thanks!