Open dzimmerman-amc opened 2 years ago
You probably fixed this long ago, but for anyone else who finds this from Google (like I did), the problem is invalid values in numeric columns.
An easy way to find the problem column in R is to run read.table("filename", header=TRUE) and check which data type was assigned to each column (for example with str()). If the "n", "beta", or "p-value" columns are character type, there is your problem: a non-numeric value has somehow slipped in.
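For anyone who would rather check outside R, the same idea can be sketched in Python using only the standard library. This is a rough sketch, not part of ldsc: the function name find_non_numeric, the tab delimiter, and the column names you pass in are all assumptions to adapt to your own file. It lists every value that Python's float() cannot parse, together with its line number. Note that float() accepts strings such as "nan" and "inf", so those are not flagged.

```python
import csv


def find_non_numeric(lines, columns, delimiter="\t"):
    """Return {column: [(line_number, value), ...]} for values float() rejects.

    `lines` is any iterable of text lines (for example an open file).
    The tab delimiter is an assumption; change it to match your file.
    """
    reader = csv.DictReader(lines, delimiter=delimiter)
    bad = {col: [] for col in columns}
    for row in reader:
        for col in columns:
            value = row.get(col)  # None if the column or field is missing
            try:
                float(value)
            except (TypeError, ValueError):
                # reader.line_num is the 1-based line number in the input
                bad[col].append((reader.line_num, value))
    return bad


# Example usage (hypothetical file name and column names):
# with open("sumstats.txt", newline="") as handle:
#     for col, hits in find_non_numeric(handle, ["N", "OR", "P"]).items():
#         print(col, hits[:10])
```

An empty list for a column means every value in it parsed as a number under these assumptions.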
@kdack, hello, I have the same problem with the p-value column. Even after I converted the values to numeric in R and saved the file again, I still got the same error, with the values recognized as character. Could you share how you solved it?
Ensuring all columns were numeric worked for me, but I imagine this type of error will occur for any formatting problem, such as stray spaces or perhaps NA values.
You could try making up some demo data and checking if it works, just to see that everything is working as intended. Then take random smaller samples of your data and check if the error occurs on all of them. That would help you narrow down exactly which lines of data are causing the issue.
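The sampling step could be sketched roughly like this in Python. It is only an illustration: sample_rows is a made-up helper name, and it assumes the first line of the file is the header and that the file fits in memory. It keeps the header and a random subset of data lines in their original order, so the result can be written to a smaller file and run through munge_sumstats.py again.

```python
import random


def sample_rows(lines, n_rows, seed=None):
    """Return the header line plus up to n_rows randomly chosen data lines.

    The chosen lines keep their original relative order. Assumes the first
    line is the header and that all lines fit in memory.
    """
    lines = list(lines)
    header, data = lines[0], lines[1:]
    rng = random.Random(seed)  # pass a seed to make the sample repeatable
    count = min(n_rows, len(data))
    picked = sorted(rng.sample(range(len(data)), count))
    return [header] + [data[i] for i in picked]


# Example usage (hypothetical file names):
# with open("sumstats.txt") as src, open("sumstats_sample.txt", "w") as dst:
#     dst.writelines(sample_rows(src, 10000, seed=1))
```

If a sample still triggers the error, repeating the procedure on that sample shrinks the search space further.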
Hi everyone,
I am trying to munge some data for later use in ldsc and I ran into this error:
/home/expcard/Projects/GWAS_SCA/GWAS_NTR/LDSC/ldsc/munge_sumstats.py \
Interpreting column names as follows:
N: Sample size
A1: Allele 1, interpreted as ref allele for signed sumstat.
P: p-Value
A2: Allele 2, interpreted as non-ref allele for signed sumstat.
SNP: Variant ID (e.g., rs number)
OR: Odds ratio (1 --> no effect; above 1 --> A1 is risk increasing)
Reading list of SNPs for allele merge from /home/dominicz/LDSC/w_hm3.snplist
Read 1217311 SNPs for allele merge.
Reading sumstats from /home/dominicz/LDSC/SCAMILIFELINESforMETALnoSNPFinalLDSCmunge.txt into memory 500000 SNPs at a time.
.
ERROR converting summary statistics:
Traceback (most recent call last):
File "/home/expcard/Projects/GWAS_SCA/GWAS_NTR/LDSC/ldsc/munge_sumstats.py", line 686, in munge_sumstats
dat = parse_dat(dat_gen, cname_translation, merge_alleles, log, args)
File "/home/expcard/Projects/GWAS_SCA/GWAS_NTR/LDSC/ldsc/munge_sumstats.py", line 238, in parse_dat
for block_num, dat in enumerate(dat_gen):
File "/home/dominicz/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/io/common.py", line 93, in
BaseIterator.next = lambda self: self.next()
File "/home/dominicz/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/io/parsers.py", line 959, in next
return self.get_chunk()
File "/home/dominicz/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/io/parsers.py", line 1019, in get_chunk
return self.read(nrows=size)
File "/home/dominicz/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/io/parsers.py", line 982, in read
ret = self._engine.read(nrows)
File "/home/dominicz/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/io/parsers.py", line 1719, in read
data = self._reader.read(nrows)
File "pandas/_libs/parsers.pyx", line 890, in pandas._libs.parsers.TextReader.read (pandas/_libs/parsers.c:10862)
File "pandas/_libs/parsers.pyx", line 924, in pandas._libs.parsers.TextReader._read_low_memory (pandas/_libs/parsers.c:11343)
File "pandas/_libs/parsers.pyx", line 989, in pandas._libs.parsers.TextReader._read_rows (pandas/_libs/parsers.c:12175)
File "pandas/_libs/parsers.pyx", line 1117, in pandas._libs.parsers.TextReader._convert_column_data (pandas/_libs/parsers.c:14136)
File "pandas/_libs/parsers.pyx", line 1190, in pandas._libs.parsers.TextReader._convert_tokens (pandas/_libs/parsers.c:15330)
ValueError: could not convert string to float: OR
Conversion finished at Wed Nov 10 15:34:55 2021
Total time elapsed: 2.42s
Traceback (most recent call last):
File "/home/expcard/Projects/GWAS_SCA/GWAS_NTR/LDSC/ldsc/munge_sumstats.py", line 745, in
munge_sumstats(parser.parse_args(), p=True)
File "/home/expcard/Projects/GWAS_SCA/GWAS_NTR/LDSC/ldsc/munge_sumstats.py", line 686, in munge_sumstats
dat = parse_dat(dat_gen, cname_translation, merge_alleles, log, args)
File "/home/expcard/Projects/GWAS_SCA/GWAS_NTR/LDSC/ldsc/munge_sumstats.py", line 238, in parse_dat
for block_num, dat in enumerate(dat_gen):
File "/home/dominicz/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/io/common.py", line 93, in
BaseIterator.next = lambda self: self.next()
File "/home/dominicz/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/io/parsers.py", line 959, in next
return self.get_chunk()
File "/home/dominicz/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/io/parsers.py", line 1019, in get_chunk
return self.read(nrows=size)
File "/home/dominicz/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/io/parsers.py", line 982, in read
ret = self._engine.read(nrows)
File "/home/dominicz/.conda/envs/ldsc/lib/python2.7/site-packages/pandas/io/parsers.py", line 1719, in read
data = self._reader.read(nrows)
File "pandas/_libs/parsers.pyx", line 890, in pandas._libs.parsers.TextReader.read (pandas/_libs/parsers.c:10862)
File "pandas/_libs/parsers.pyx", line 924, in pandas._libs.parsers.TextReader._read_low_memory (pandas/_libs/parsers.c:11343)
File "pandas/_libs/parsers.pyx", line 989, in pandas._libs.parsers.TextReader._read_rows (pandas/_libs/parsers.c:12175)
File "pandas/_libs/parsers.pyx", line 1117, in pandas._libs.parsers.TextReader._convert_column_data (pandas/_libs/parsers.c:14136)
File "pandas/_libs/parsers.pyx", line 1190, in pandas._libs.parsers.TextReader._convert_tokens (pandas/_libs/parsers.c:15330)
ValueError: could not convert string to float: OR
Does anyone know how to fix this? I thought it might be caused by empty values in the OR column, but there aren't any.
Thanks in advance!
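One thing worth noting about the message "could not convert string to float: OR" is that the value pandas failed to parse is the literal string "OR", which is also the column name. One possible explanation (an assumption, not something confirmed in this thread) is that a header line appears again somewhere in the data, for instance after concatenating several files. A minimal Python sketch to check for that, using a hypothetical helper find_repeated_header and assuming whitespace-separated columns with the header on the first line:

```python
def find_repeated_header(lines, column):
    """Return 1-based line numbers of data lines whose `column` field
    contains the column name itself (e.g. "OR" in the OR column).

    Assumes whitespace-separated fields and a header on the first line.
    """
    hits = []
    index = None
    for line_number, line in enumerate(lines, start=1):
        fields = line.split()
        if line_number == 1:
            index = fields.index(column)  # position of the column in the header
            continue
        if len(fields) > index and fields[index] == column:
            hits.append(line_number)
    return hits


# Example usage (hypothetical file name):
# with open("sumstats.txt") as handle:
#     print(find_repeated_header(handle, "OR"))
```

If this returns any line numbers, removing those lines and re-running the munge step would be a reasonable next test.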