monarch-initiative / dipper

Data Ingestion Pipeline for Monarch
https://dipper.readthedocs.io/en/latest/
BSD 3-Clause "New" or "Revised" License

Unable to download HGNC dataset #903

Closed by Prakash2403 4 years ago

Prakash2403 commented 4 years ago

Command executed: dipper-etl.py --sources hgnc

Log

INFO:__main__:Will Skolemize Blank Nodes
test_curieprefixes (tests.test_general.GeneralGraphTestCase) ... ok

----------------------------------------------------------------------
Ran 1 test in 0.023s

OK
INFO:__main__:
******* hgnc *******
WARNING:dipper.config:'dipper/conf.yaml' not found in '/home/prakash/.local/lib/python3.7/site-packages/dipper'
WARNING:dipper.config:Sources that depend on 'conf.yaml' will fail
Traceback (most recent call last):
  File "/home/prakash/.local/bin/dipper-etl.py", line 285, in <module>
    main()
  File "/home/prakash/.local/bin/dipper-etl.py", line 210, in main
    imported_module = importlib.import_module(module)
  File "/usr/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/prakash/.local/lib/python3.7/site-packages/dipper/sources/HGNC.py", line 5, in <module>
    from dipper.sources.OMIMSource import OMIMSource
  File "/home/prakash/.local/lib/python3.7/site-packages/dipper/sources/OMIMSource.py", line 11, in <module>
    MONARCHIVE = MONARCHURL + config.get_config()['keys']['monarchive']
KeyError: 'monarchive'

TomConlin commented 4 years ago

Legally, you need your own API/download key for OMIM, which the Dipper HGNC ingest uses; we are not permitted to make their files available. See: https://www.omim.org/downloads/

Prakash2403 commented 4 years ago

@TomConlin Okay, but it is failing because the Monarch Archive key is missing. Do I need a key from Monarch Archive too?

TomConlin commented 4 years ago

No, it is just a local copy of the exact same OMIM file, kept so that we avoid excessive re-fetching from them. It is a relatively new feature, and I likely need to make cache misses more transparent if you are noticing them. (I had no idea anyone outside of Monarch ever ran Dipper.)
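For anyone hitting the same traceback: the `KeyError` comes from a direct `['keys']['monarchive']` lookup that runs at import time. A minimal sketch of a more tolerant lookup follows. It is not Dipper's actual code: the plain dict stands in for whatever `config.get_config()` returns, and `get_optional_key`, `archive_url`, and the `https://example.org/` base URL are hypothetical names chosen for illustration.

```python
import logging
from typing import Any, Mapping, Optional

LOG = logging.getLogger(__name__)


def get_optional_key(config: Mapping[str, Any], name: str) -> Optional[str]:
    """Return config['keys'][name], or None when either level is missing."""
    # 'keys' may be absent or None when conf.yaml was not found.
    keys = config.get("keys") or {}
    return keys.get(name)


def archive_url(config: Mapping[str, Any], base_url: str) -> Optional[str]:
    """Build the archive URL, or return None (with a warning) on a missing key."""
    suffix = get_optional_key(config, "monarchive")
    if suffix is None:
        # Make the cache miss visible instead of failing at import time.
        LOG.warning("No 'monarchive' key configured; the local archive copy is unavailable")
        return None
    return base_url + suffix


if __name__ == "__main__":
    # Hypothetical base URL; an empty config mimics a missing conf.yaml.
    print(archive_url({}, "https://example.org/"))
    print(archive_url({"keys": {"monarchive": "abc"}}, "https://example.org/"))
```

With a lookup like this, a missing archive key degrades to a logged warning, and the caller can decide whether to fall back to fetching from OMIM with the user's own key.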

Also: Dipper's HGNC dataset is available to download from https://data.monarchinitiative.org/dev/

Prakash2403 commented 4 years ago

@TomConlin Thanks for pointing to the data source :)