statgen / pheweb

A tool to build a website to browse hundreds or thousands of GWAS.
MIT License

get_gene_aliases failing due to capitalization in header #117

Closed jgrundstad closed 5 years ago

jgrundstad commented 5 years ago

In load/make_gene_aliases_trie.py, the assertion fails because the header of the downloaded source file has changed:

r = requests.get('http://www.genenames.org/cgi-bin/download?col=gd_app_sym&col=gd_prev_sym&col=gd_aliases&col=gd_pub_ensembl_id&status=Approved&status=Entry+Withdrawn&status_opt=2&where=&order_by=gd_app_sym_sort&format=text&limit=&hgnc_dbtag=on&submit=submit')

# in _parse_rows(line), assertion FAILS
assert lines[0].split('\t') == ['Approved Symbol','Previous Symbols','Synonyms','Ensembl Gene ID']

As of 12/6/18 the header at genenames.org is:

Approved symbol    Previous symbols    Synonyms    Ensembl gene ID

pjvandehaar commented 5 years ago

Please update to pheweb 1.1.7 and let me know if the error still occurs. I ran into this last night and I believe I fixed it. Thanks for looking up the new header; I always appreciate it when bug reporters do the investigation for me. I've switched to genenames.org's JSON download (hosted by EBI), which I'm hoping will change less frequently.
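
For anyone stuck on an older version, the capitalization mismatch reported above could in principle be tolerated by comparing the header case-insensitively. The following is only a minimal sketch of that idea, not the fix that shipped in pheweb 1.1.7 (which switched to the JSON download); the names `EXPECTED_HEADER` and `header_matches` are hypothetical and do not appear in pheweb.

```python
# Hypothetical helper, not part of pheweb: a case-insensitive header check.
# Column names are taken from the assertion quoted in the report above.
EXPECTED_HEADER = ['Approved Symbol', 'Previous Symbols', 'Synonyms', 'Ensembl Gene ID']


def header_matches(header_line, expected=EXPECTED_HEADER):
    """Return True if a tab-separated header line matches `expected`,
    ignoring letter case and whitespace around each column name."""
    columns = header_line.rstrip('\r\n').split('\t')
    actual = [col.strip().casefold() for col in columns]
    return actual == [col.casefold() for col in expected]


if __name__ == '__main__':
    # The header reported on 12/6/18, with tabs between the columns.
    new_header = 'Approved symbol\tPrevious symbols\tSynonyms\tEnsembl gene ID'
    assert header_matches(new_header)
    # A header with different columns is still rejected.
    assert not header_matches('Approved symbol\tSynonyms')
```

A check like this only guards against capitalization changes; renamed or reordered columns would still fail it, which is arguably the desired behavior for a sanity assertion.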