hakyimlab / summary-gwas-imputation

harmonization, liftover, and imputation of summary statistics from GWAS
MIT License

Error in gwas harmonization step #12

Status: Closed (rosoffdb closed this issue 3 years ago)

rosoffdb commented 3 years ago

Hello -

I'm working through the tutorial and trying it out on the new COVID-19 GWAS summary statistics. I keep getting the error below:

python ./summary-gwas-imputation-master/src/gwas_parsing.py \
-gwas_file ./Desktop/gwastools/covida2.preptest.txt \
-liftover ./DATA/liftover/hg19ToHg38.over.chain.gz \
-snp_reference_metadata ./DATA/reference_panel_1000G/variant_metadata.txt.gz METADATA \
-output_column_map markername variant_id \
-output_column_map noneffect_allele non_effect_allele \
-output_column_map effect_allele effect_allele \
-output_column_map beta effect_size \
-output_column_map p_dgc pvalue \
-output_column_map chr chromosome \
--chromosome_format \
-output_column_map bp_hg19 position \
-output_column_map effect_allele_freq frequency \
--insert_value sample_size 709010 --insert_value n_cases 5582 \
-output_order variant_id panel_variant_id chromosome position effect_allele non_effect_allele frequency pvalue zscore effect_size standard_error sample_size n_cases \
-output ./Desktop/gwastools/covida2.harmonized.txt.gz

INFO - Parsing input GWAS
Traceback (most recent call last):
  File "./summary-gwas-imputation-master/src/gwas_parsing.py", line 311, in <module>
    run(args)
  File "./summary-gwas-imputation-master/src/gwas_parsing.py", line 258, in run
    enforce_numeric_columns=args.enforce_numeric_columns)
  File "/Users/XYZ/Desktop/gwastools/summary-gwas-imputation-master/src/genomic_tools_lib/file_formats/gwas/GWAS.py", line 18, in load_gwas
    d = _ensure_columns(d, input_pvalue_fix, enforce_numeric_columns)
  File "/Users/XYZ/Desktop/gwastools/summary-gwas-imputation-master/src/genomic_tools_lib/file_formats/gwas/GWAS.py", line 31, in _ensure_columns
    d[EFFECT_ALLELE] = d[EFFECT_ALLELE].str.upper()
  File "/usr/local/anaconda3/envs/imlabtools/lib/python3.7/site-packages/pandas/core/generic.py", line 5175, in __getattr__
    return object.__getattribute__(self, name)
  File "/usr/local/anaconda3/envs/imlabtools/lib/python3.7/site-packages/pandas/core/accessor.py", line 175, in __get__
    accessor_obj = self._accessor(obj)
  File "/usr/local/anaconda3/envs/imlabtools/lib/python3.7/site-packages/pandas/core/strings.py", line 1917, in __init__
    self._inferred_dtype = self._validate(data)
  File "/usr/local/anaconda3/envs/imlabtools/lib/python3.7/site-packages/pandas/core/strings.py", line 1967, in _validate
    raise AttributeError("Can only use .str accessor with string " "values!")
AttributeError: Can only use .str accessor with string values!

I've tried several ways of installing the gwastools packages (standard install, specifying Python 3.5, etc.) and have tweaked the summary statistics file several times, too. Notably, I don't get this error with the quick harmonization method outlined in the MetaXcan MASHR GTEx V8 tutorials, so I don't think it is caused by the GWAS summary file. Any help and guidance would be greatly appreciated!

Very respectfully, Dan

rosoffdb commented 3 years ago

Hi -

I believe I found the problem: the summary statistics file was not tab-separated. Thank you!
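
For anyone who hits the same error, here is a minimal sketch of how the input file could be checked and rewritten as tab-separated text before it is passed to gwas_parsing.py. It is not part of summary-gwas-imputation: the helper names detect_delimiter and to_tab_separated are hypothetical, the delimiter guess is a simple heuristic based on the header line only, and fields are assumed not to contain quoted delimiters.

```python
import io


def detect_delimiter(header_line, candidates=("\t", ",", ";", " ")):
    """Guess the delimiter as the candidate that splits the header into the most fields.

    Returns None when no candidate produces more than one field.
    Heuristic only: ties are resolved in favor of the earlier candidate.
    """
    header = header_line.rstrip("\r\n")
    counts = {d: len(header.split(d)) for d in candidates}
    best = max(candidates, key=lambda d: counts[d])
    return best if counts[best] > 1 else None


def split_line(line, delim):
    """Split one line; runs of spaces are treated as a single separator."""
    line = line.rstrip("\r\n")
    return line.split() if delim == " " else line.split(delim)


def to_tab_separated(src, dst):
    """Copy the text stream `src` to `dst` as tab-separated text.

    Returns the delimiter that was detected in the header.
    """
    header = src.readline()
    delim = detect_delimiter(header)
    if delim is None:
        raise ValueError("could not find a delimiter in the header line")
    dst.write("\t".join(split_line(header, delim)) + "\n")
    for line in src:
        if line.strip():  # skip blank lines
            dst.write("\t".join(split_line(line, delim)) + "\n")
    return delim


if __name__ == "__main__":
    # Small in-memory example; column names follow the command in this thread.
    src = io.StringIO("markername effect_allele beta\nrs1 A 0.1\n")
    dst = io.StringIO()
    print(repr(to_tab_separated(src, dst)))
    print(dst.getvalue(), end="")
```

With real files, the same function can be called on handles opened with open(path) and open(out_path, "w"); the converted file is then the one to pass to -gwas_file.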