bulik / ldsc

LD Score Regression (LDSC)
GNU General Public License v3.0

ERROR converting summary statistics #121

Open ZheZhang-ZZ opened 6 years ago

ZheZhang-ZZ commented 6 years ago

Hi, When using munge_sumstats.py, I ran into the error below:

Interpreting column names as follows:
Allele2: Allele 2, interpreted as non-ref allele for signed sumstat.
P: p-Value
Allele1: Allele 1, interpreted as ref allele for signed sumstat.
Effect: [linear/logistic] regression coefficient (0 --> no effect; above 0 --> A1 is trait/risk increasing)
SNP: Variant ID (e.g., rs number)

Reading list of SNPs for allele merge from /home/zz/meta_gwas3/1000G/1000G_phase3_hap/w_hm3.snplist
Read 29083171 SNPs for allele merge.
Reading sumstats from UC_N45975.txt.gz into memory 5000000 SNPs at a time.
.. done
Read 8984316 SNPs from --sumstats file.
Removed 0 SNPs not in --merge-alleles.
Removed 3 SNPs with missing values.
Removed 0 SNPs with INFO <= 0.9.
Removed 0 SNPs with MAF <= 0.01.
Removed 0 SNPs with out-of-bounds p-values.
Removed 1330724 variants that were not SNPs or were strand-ambiguous.
7653589 SNPs remain.
Removed 0 SNPs with duplicated rs numbers (7653589 SNPs remain).
Using N = 45975.0
Median value of Effect was -0.0001, which seems sensible.
Removed 91 SNPs whose alleles did not match --merge-alleles (7653502 SNPs remain).

ERROR converting summary statistics:

Traceback (most recent call last):
  File "/home/zz/software/ldsc/munge_sumstats.py", line 703, in munge_sumstats
    dat = allele_merge(dat, merge_alleles, log)
  File "/home/zz/software/ldsc/munge_sumstats.py", line 441, in allele_merge
    dat.loc[~jj, [i for i in dat.columns if i != 'SNP']] = float('nan')
  File "/home/zz/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 193, in __setitem__
    indexer = self._get_setitem_indexer(key)
  File "/home/zz/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 171, in _get_setitem_indexer
    return self._convert_tuple(key, is_setter=True)
  File "/home/zz/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 242, in _convert_tuple
    idx = self._convert_to_indexer(k, axis=i, is_setter=is_setter)
  File "/home/zz/anaconda2/lib/python2.7/site-packages/pandas/core/indexing.py", line 1269, in _convert_to_indexer
    .format(mask=objarr[mask]))
KeyError: '[-1 -1 -1 ... -1 -1 -1] not in index'

Conversion finished at Fri Jun 15 02:04:08 2018

Could you help me figure out how to handle this kind of error?

JianqiaoWang commented 6 years ago

I am getting the same error and don't know why.

  File "/Users/w/anaconda3/envs/py27/lib/python2.7/site-packages/pandas/core/indexing.py", line 1327, in _convert_to_indexer
    .format(mask=objarr[mask]))

KeyError: '[-1 -1 -2 ... -1 -1 -1] not in index'

JianqiaoWang commented 6 years ago

I just noticed that this problem has already been discussed in issue #104. It can be solved by reverting to an older pandas version.

rkwalters commented 6 years ago

Hi, That's correct, this is one of the known pandas versioning issues. Glad you were able to resolve this based on the previous discussion in #104. The conda environment discussed in that thread has now been released (see the updated Readme and the newly added environment.yml), so hopefully that will help prevent this from arising in the future. Cheers, Raymond
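For readers wondering where the `-1` and `-2` values in the KeyError come from, here is a minimal sketch of one plausible mechanism. The thread itself only identifies this as a pandas versioning issue, so treat this as an assumption: if the mask that gets inverted with `~` holds integers rather than booleans, `~` performs a bitwise complement (`~0 == -1`, `~1 == -2`), and `.loc` then treats those integers as row labels. The frame `dat`, the series `mask_as_int`, and the column names below are hypothetical stand-ins, not the actual ldsc data.

```python
import pandas as pd

# Hypothetical stand-in for the match mask from the traceback (`jj`),
# assumed here to have ended up with an integer dtype instead of bool.
mask_as_int = pd.Series([1, 0, 1, 1], dtype="int64")

# On integers, ~ is a bitwise complement, so 1 -> -2 and 0 -> -1.
inverted_int = ~mask_as_int
print(inverted_int.tolist())

# Casting to bool first turns ~ into a logical NOT.
inverted_bool = ~mask_as_int.astype(bool)
print(inverted_bool.tolist())

# Toy frame (hypothetical columns) mirroring the failing assignment:
# blank every non-SNP column on rows where the mask is False.
dat = pd.DataFrame({
    "SNP": ["rs1", "rs2", "rs3", "rs4"],
    "Z": [0.1, -0.2, 0.3, 0.4],
    "N": [100.0, 100.0, 100.0, 100.0],
})
cols = [c for c in dat.columns if c != "SNP"]
dat.loc[inverted_bool, cols] = float("nan")
print(dat)
```

With the boolean mask, only the second row has its non-SNP columns set to NaN. This is an illustration only; the route to a fix described in this thread is to use a compatible pandas version, for example through the conda environment mentioned above.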