artic-network / civet

Cluster Investigation & Virus Epidemiology Tool
https://cov-lineages.org/resources/civet.html
GNU General Public License v3.0

Nottinghamshire gets mapped to `SUTTON-IN-ASHFIELD` for some reason #74

Closed: rambaut closed this issue 3 years ago

rambaut commented 3 years ago

The adm2 mapping file, `adm2_cleaning.csv`, has the following line: `SUTTON-IN-ASHFIELD,Nottinghamshire,,,,,,,,,`

This produces a `KeyError` in a later lookup:

Traceback (most recent call last):
  File "/Users/rambaut/opt/miniconda3/envs/civet/bin/local_scale_analysis.py", line 4, in <module>
    __import__('pkg_resources').run_script('civet==2.1.0', 'local_scale_analysis.py')
  File "/Users/rambaut/opt/miniconda3/envs/civet/lib/python3.6/site-packages/pkg_resources/__init__.py", line 665, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/Users/rambaut/opt/miniconda3/envs/civet/lib/python3.6/site-packages/pkg_resources/__init__.py", line 1463, in run_script
    exec(code, namespace, namespace)
  File "/Users/rambaut/opt/miniconda3/envs/civet/lib/python3.6/site-packages/civet-2.1.0-py3.6.egg/EGG-INFO/scripts/local_scale_analysis.py", line 640, in <module>
    Central_HB_code=adm2_to_centralHBCode(inputSamples, HBTranslation, HBCode_name_translation)
  File "/Users/rambaut/opt/miniconda3/envs/civet/lib/python3.6/site-packages/civet-2.1.0-py3.6.egg/EGG-INFO/scripts/local_scale_analysis.py", line 318, in adm2_to_centralHBCode
    sampleframe['adm2'] = sampleframe['adm2'].str.upper()
  File "/Users/rambaut/opt/miniconda3/envs/civet/lib/python3.6/site-packages/pandas/core/frame.py", line 2899, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/Users/rambaut/opt/miniconda3/envs/civet/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 2891, in get_loc
    raise KeyError(key) from err
KeyError: 'adm2'

Removing this line fixes the issue, but a clearer error message stating which location couldn't be found would be useful.
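As a rough illustration of the kind of error reporting asked for here, the sketch below looks up each adm2 value in a translation table and names every value it cannot find in a single message. The function `translate_adm2` and its `translation` dictionary are hypothetical and not part of civet; the real lookup in `local_scale_analysis.py` works on a pandas data frame and may be structured quite differently.

```python
# Hypothetical helper, not civet code: translate adm2 names and report
# every name that is missing from the translation table in one error.
def translate_adm2(adm2_values, translation):
    """Return the translated values for adm2_values.

    Names are upper-cased before the lookup. If any name is absent from
    `translation`, raise a ValueError that lists all missing names.
    """
    normalised = [value.strip().upper() for value in adm2_values]
    missing = sorted({name for name in normalised if name not in translation})
    if missing:
        raise ValueError(
            "adm2 location(s) not found in translation table: "
            + ", ".join(missing)
        )
    return [translation[name] for name in normalised]
```

With a made-up table such as `{"NOTTINGHAMSHIRE": "EAST MIDLANDS"}`, passing `["Sutton-in-Ashfield"]` would raise a `ValueError` that names `SUTTON-IN-ASHFIELD` rather than a bare `KeyError`.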

rambaut commented 3 years ago

It may be that this mapping was supposed to go in the other direction, i.e., `SUTTON-IN-ASHFIELD` should map to `EAST MIDLANDS` in the later line in this file. This needs to be checked.
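One way to check for rows like this is to scan the mapping file for names that are used both as a cleaned name and as a variant of a different cleaned name. The sketch below assumes, based on the issue title, that the first column of each row is the cleaned name and the remaining non-empty columns are variants that map to it. That layout is an assumption about `adm2_cleaning.csv`, and `find_ambiguous_names` is a hypothetical helper, not part of civet.

```python
import csv
import io


# Hypothetical check, not civet code. Assumes column 1 is the cleaned
# name and every other non-empty column is a variant that maps to it.
def find_ambiguous_names(csv_text):
    """Return names that appear as a cleaned name in one row and as a
    variant of a different cleaned name in another row."""
    cleaned = set()
    variants = set()
    for row in csv.reader(io.StringIO(csv_text)):
        cells = [cell.strip().upper() for cell in row if cell.strip()]
        if not cells:
            continue  # skip blank rows
        cleaned.add(cells[0])
        # Ignore a variant that merely repeats its own cleaned name.
        variants.update(cell for cell in cells[1:] if cell != cells[0])
    return sorted(cleaned & variants)
```

For example, given the line from this issue plus a made-up second row, `find_ambiguous_names("SUTTON-IN-ASHFIELD,Nottinghamshire,,,\nNOTTINGHAMSHIRE,Notts,,,\n")` would flag `NOTTINGHAMSHIRE`, which suggests that one of the two rows points in the wrong direction.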