statgen / pheweb

A tool to build a website to browse hundreds or thousands of GWAS.
MIT License

pheweb parsing errors #165

Closed jzluo closed 8 months ago

jzluo commented 3 years ago

Hi, I'm running into an issue, I think with parse-input-files:

Please include:

Child process had exception, info dumped to ~/pheweb/generated-by-pheweb/tmp/exception-2021-06-08T15-04-24.418222 (Details in ~/pheweb/generated-by-pheweb/tmp/exception-2021-06-08T15-04-24.431163)

It hangs at this point.

$ cat ~/pheweb/generated-by-pheweb/tmp/exception-2021-06-08T15-04-24.431163
======= Exception ====
Child process had exception, info dumped to ~/pheweb/generated-by-pheweb/tmp/exception-2021-06-08T15-04-24.418222

======= Traceback ====
Traceback (most recent call last):
  File "/home/jon/.local/lib/python3.7/site-packages/pheweb/command_line.py", line 131, in main
    run(sys.argv[1:])
  File "/home/jon/.local/lib/python3.7/site-packages/pheweb/command_line.py", line 121, in run
    handlerssubcommand
  File "/home/jon/.local/lib/python3.7/site-packages/pheweb/command_line.py", line 62, in f
    module_run(argv)
  File "/home/jon/.local/lib/python3.7/site-packages/pheweb/load/sites.py", line 63, in run
    manna.apply_ret(ret)
  File "/home/jon/.local/lib/python3.7/site-packages/pheweb/load/sites.py", line 100, in apply_ret
    raise PheWebError('Child process had exception, info dumped to {}'.format(exc_filepath))
pheweb.utils.PheWebError: Child process had exception, info dumped to ~/pheweb/generated-by-pheweb/tmp/exception-2021-06-08T15-04-24.418222

$ cat ~/pheweb/generated-by-pheweb/tmp/exception-2021-06-08T15-04-24.418222
Child process had exception:
(['6', '351611.0', '6.7', '0.00011'], ['chrom', 'pos', 'ref', 'alt', 'pval', 'beta', 'sebeta', 'maf'])
Traceback:
Traceback (most recent call last):
  File "/home/jon/.local/lib/python3.7/site-packages/pheweb/load/sites.py", line 131, in mp_target
    for ret in merge(task['files_to_merge'], task['out_filepath']):
  File "/home/jon/.local/lib/python3.7/site-packages/pheweb/load/sites.py", line 187, in merge
    new_v = next(readers[reader_id])
  File "/home/jon/.local/lib/python3.7/site-packages/pheweb/file_utils.py", line 169, in _get_variants
    assert len(unparsed_variant) == len(self._all_fields), (unparsed_variant, self._all_fields)
AssertionError: (['6', '351611.0', '6.7', '0.00011'], ['chrom', 'pos', 'ref', 'alt', 'pval', 'beta', 'sebeta', 'maf'])

- snippets of relevant files, especially files mentioned in the error.

Snippet of parsed file:

$ zcat ~/pheweb/generated-by-pheweb/parsed/008_52 | grep "6 351611"
6 35161144 G T 0.026 3.0 1.1 0.00014
6 351611.0 6.7 0.00011
6 3516110.016 .3 0.0 01 0.04

gzip: 008_52: invalid compressed data--format violated


Snippet of input file:

6,331945,C,A,0.7636440987725183,0.000107198,-1.02973,3.42447
6,348277,A,G,0.7509724415781722,0.000260193,-1.01563,3.20025
6,365986,A,G,0.8270841534049443,0.000122692,-1.01508,4.6469

- your `config.py`.

hg_build_number = 38
show_manhattan_filter_consequence = True
show_manhattan_filter_button=True



Otherwise, parsing seems to work for most of the file, as far as I can see.

pjvandehaar commented 3 years ago

This is a great bug report, thanks.

Can you show me the line in your input file with 6,351611?

And also the line with 3516110.016? `pos` is treated as an integer by pheweb, so I don't understand why it wrote out 3516110.016. That's probably related to the problem.

Could you show me the output of `zcat ~/pheweb/generated-by-pheweb/parsed/008_52 | grep "6 351611" | hexdump -C`? Perhaps there are more tabs in there that didn't survive the copy-paste.
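
To complement the hexdump check, here is a minimal Python sketch that scans a gzipped file for rows whose field count differs from the header. That is the same condition the failing assertion tests. It is not part of pheweb: `find_malformed_rows` is a hypothetical helper, and it assumes the parsed file is tab-delimited with a header row. Because `zcat` reported invalid compressed data, it also catches decompression errors and reports roughly where they occurred.

```python
import gzip
import zlib


def find_malformed_rows(path, delimiter="\t"):
    """Scan a gzipped, delimited file that starts with a header row.

    Returns (bad_rows, error):
      bad_rows: list of (line_number, fields) whose field count differs
                from the header's field count.
      error:    a message if decompression failed partway, else None.
    """
    bad_rows = []
    error = None
    header = None
    line_number = 0
    try:
        # errors="replace" keeps undecodable bytes from aborting the scan.
        with gzip.open(path, "rt", errors="replace") as f:
            for line in f:
                line_number += 1
                fields = line.rstrip("\n").split(delimiter)
                if header is None:
                    header = fields  # first line is assumed to be the header
                elif len(fields) != len(header):
                    bad_rows.append((line_number, fields))
    except (OSError, EOFError, zlib.error) as exc:
        # Corrupt or truncated gzip data surfaces as one of these exceptions.
        error = f"decompression failed after line {line_number}: {exc}"
    return bad_rows, error
```

Calling it on the expanded path, for example `find_malformed_rows(os.path.expanduser("~/pheweb/generated-by-pheweb/parsed/008_52"))`, should list the short rows with their line numbers, which makes it easier to find the matching lines in the input file.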