Hello,
I saw a few tickets reporting a similar issue when using:
./munge_sumstats.py \
  --sumstats UKB.GWAS.txt \
  --N 336974 \
  --ignore BETA \
  --out UKB_DR \
  --merge-alleles w_hm3.snplist
I am getting the error below. Those tickets suggest downgrading pandas. When I run the same command without --merge-alleles w_hm3.snplist, I don't get any error.
My question is: what is the purpose of --merge-alleles w_hm3.snplist?
Is the purpose of that flag to keep in my UKB_DR output only the SNPs present in w_hm3.snplist, and to set A1 and A2 to the values given in w_hm3.snplist? If so, I can do that in R without changing my pandas version.
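To make the question concrete, here is a minimal sketch of the manual filtering I have in mind, written in Python with pandas (the same logic would carry over to R). It is only my reading of what the flag might do, not LDSC's actual implementation. The helper name `restrict_to_snplist` and the toy data are hypothetical, and the sketch assumes both tables have SNP, A1 and A2 columns, that alleles are on the same strand, and that Z is signed relative to A1.

```python
import pandas as pd


def restrict_to_snplist(sumstats, snplist):
    """Hypothetical helper (not part of LDSC).

    Keeps only SNPs present in snplist whose allele pair agrees with it,
    then relabels A1/A2 to the snplist values.
    """
    ref = snplist.rename(columns={"A1": "REF_A1", "A2": "REF_A2"})
    # Inner merge drops SNPs that are absent from the reference list.
    merged = sumstats.merge(ref[["SNP", "REF_A1", "REF_A2"]], on="SNP", how="inner")

    a1, a2 = merged["A1"].str.upper(), merged["A2"].str.upper()
    r1, r2 = merged["REF_A1"].str.upper(), merged["REF_A2"].str.upper()
    same = (a1 == r1) & (a2 == r2)
    swapped = (a1 == r2) & (a2 == r1)

    # Drop SNPs whose alleles match in neither order
    # (strand flips are not handled in this sketch).
    kept = merged.loc[same | swapped].copy()

    # Z is relative to A1, so negate it where A1 and A2 trade places.
    flip = swapped.loc[kept.index]
    kept.loc[flip, "Z"] = -kept.loc[flip, "Z"]
    kept["A1"] = kept["REF_A1"]
    kept["A2"] = kept["REF_A2"]
    return kept.drop(columns=["REF_A1", "REF_A2"]).reset_index(drop=True)


# Toy data standing in for UKB.GWAS.txt and w_hm3.snplist.
sumstats = pd.DataFrame({
    "SNP": ["rs1", "rs2", "rs3", "rs4"],
    "A1": ["A", "C", "G", "A"],
    "A2": ["G", "T", "A", "C"],
    "Z": [1.5, -0.2, 0.7, 2.0],
})
snplist = pd.DataFrame({
    "SNP": ["rs1", "rs3", "rs4", "rs9"],
    "A1": ["A", "A", "A", "C"],
    "A2": ["G", "G", "G", "T"],
})

result = restrict_to_snplist(sumstats, snplist)
print(result)
```

In the toy data, rs2 is dropped because it is not in the list, rs4 is dropped because its alleles disagree with the list, and rs3 has its alleles swapped, so its Z changes sign.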
Thanks,
Ana
My Error:
Call:
./munge_sumstats.py \
  --out UKB_DR \
  --merge-alleles w_hm3.snplist \
  --N 336974.0 \
  --sumstats UKB.GWAS.txt \
  --ignore BETA
Interpreting column names as follows:
A1: Allele 1, interpreted as ref allele for signed sumstat.
P: p-Value
Z: Z-score (0 --> no effect; above 0 --> A1 is trait/risk increasing)
A2: Allele 2, interpreted as non-ref allele for signed sumstat.
SNP: Variant ID (e.g., rs number)
Reading list of SNPs for allele merge from w_hm3.snplist
Read 1217311 SNPs for allele merge.
Reading sumstats from UKB.GWAS.txt into memory 5000000 SNPs at a time.
. done
Read 3859763 SNPs from --sumstats file.
Removed 3329298 SNPs not in --merge-alleles.
Removed 0 SNPs with missing values.
Removed 0 SNPs with INFO <= 0.9.
Removed 0 SNPs with MAF <= 0.01.
Removed 0 SNPs with out-of-bounds p-values.
Removed 105 variants that were not SNPs or were strand-ambiguous.
530360 SNPs remain.
Removed 30 SNPs with duplicated rs numbers (530330 SNPs remain).
Using N = 336974.0
Median value of Z was -0.00381962, which seems sensible.
Removed 114 SNPs whose alleles did not match --merge-alleles (530216 SNPs remain).
ERROR converting summary statistics:
Traceback (most recent call last):
File "./munge_sumstats.py", line 707, in munge_sumstats
dat = allele_merge(dat, merge_alleles, log)
File "./munge_sumstats.py", line 445, in allele_merge
dat.loc[~jj, [i for i in dat.columns if i != 'SNP']] = float('nan')
File "/home/anamaria/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/core/indexing.py", line 189, in __setitem__
indexer = self._get_setitem_indexer(key)
File "/home/anamaria/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/core/indexing.py", line 167, in _get_setitem_indexer
return self._convert_tuple(key, is_setter=True)
File "/home/anamaria/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/core/indexing.py", line 248, in _convert_tuple
idx = self._convert_to_indexer(k, axis=i, is_setter=is_setter)
File "/home/anamaria/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/core/indexing.py", line 1354, in _convert_to_indexer
return self._get_listlike_indexer(obj, axis, **kwargs)[1]
File "/home/anamaria/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/core/indexing.py", line 1161, in _get_listlike_indexer
raise_missing=raise_missing)
File "/home/anamaria/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/core/indexing.py", line 1246, in _validate_read_indexer
key=key, axis=self.obj._get_axis_name(axis)))
KeyError: u"None of [Int64Index([-1, -1, -1, -1, -1, -1, -1, -1, -1, -1,\n ...\n -1, -1, -1, -1, -1, -1, -1, -1, -1, -1], dtype='int64', length=1217311)] are in the [index]"
Conversion finished at Thu Nov 19 13:11:12 2020
Total time elapsed: 18.13s
Traceback (most recent call last):
File "./munge_sumstats.py", line 746, in <module>
munge_sumstats(parser.parse_args(), p=True)
File "./munge_sumstats.py", line 707, in munge_sumstats
dat = allele_merge(dat, merge_alleles, log)
File "./munge_sumstats.py", line 445, in allele_merge
dat.loc[~jj, [i for i in dat.columns if i != 'SNP']] = float('nan')
File "/home/anamaria/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/core/indexing.py", line 189, in __setitem__
indexer = self._get_setitem_indexer(key)
File "/home/anamaria/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/core/indexing.py", line 167, in _get_setitem_indexer
return self._convert_tuple(key, is_setter=True)
File "/home/anamaria/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/core/indexing.py", line 248, in _convert_tuple
idx = self._convert_to_indexer(k, axis=i, is_setter=is_setter)
File "/home/anamaria/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/core/indexing.py", line 1354, in _convert_to_indexer
return self._get_listlike_indexer(obj, axis, **kwargs)[1]
File "/home/anamaria/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/core/indexing.py", line 1161, in _get_listlike_indexer
raise_missing=raise_missing)
File "/home/anamaria/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/core/indexing.py", line 1246, in _validate_read_indexer
key=key, axis=self.obj._get_axis_name(axis)))
KeyError: u"None of [Int64Index([-1, -1, -1, -1, -1, -1, -1, -1, -1, -1,\n ...\n -1, -1, -1, -1, -1, -1, -1, -1, -1, -1], dtype='int64', length=1217311)] are in the [index]"
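One observation about the KeyError, offered as a guess rather than a confirmed diagnosis: the index it complains about is full of -1 values, which is what bitwise NOT produces on integer zeros. That would be consistent with the mask `jj` in `dat.loc[~jj, ...]` holding integers instead of booleans in this pandas version, so that `~jj` is treated as a list of row labels rather than a boolean mask. The small sketch below uses made-up data (the names `jj_int`, `jj_bool` and the toy `dat` are mine, not LDSC's) to show the difference.

```python
import pandas as pd

# Toy frame standing in for the merged sumstats.
dat = pd.DataFrame({"SNP": ["rs1", "rs2", "rs3"], "Z": [0.1, 0.2, 0.3]})

# Match flags stored as integers (assumption: this is what jj ends up as).
jj_int = pd.Series([0, 1, 0])
inverted_int = ~jj_int          # bitwise NOT on integers, not a logical NOT
print(inverted_int.tolist())

# Casting to bool first gives a real boolean mask.
jj_bool = jj_int.astype(bool)
inverted_bool = ~jj_bool
print(inverted_bool.tolist())

# With a boolean mask, .loc selects rows by position in the mask.
dat.loc[inverted_bool, ["Z"]] = float("nan")
print(dat)
```

If this reading is right, casting the mask to bool before inverting it might be an alternative to downgrading pandas, but I have not verified that against the LDSC source.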