nextstrain / augur

Pipeline components for real-time phylodynamic analysis
https://docs.nextstrain.org/projects/augur/
GNU Affero General Public License v3.0

Poorly formatted metadata TSV leads to KeyError #1572

Closed by joverlee521 1 month ago

joverlee521 commented 1 month ago

First seen in Nextstrain office hours

If a metadata TSV file is poorly formatted (e.g. an extra tab inside a value), augur filter can fail with a misleading KeyError that points at the ID column rather than at the malformed row:

Traceback (most recent call last):
  File "/nextstrain/augur/augur/__init__.py", line 66, in run
    return args.__command__.run(args)
  File "/nextstrain/augur/augur/filter/__init__.py", line 109, in run
    return _run(args)
  File "/nextstrain/augur/augur/filter/_run.py", line 180, in run
    for metadata in metadata_reader:
  File "/usr/local/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1698, in __next__
    return self.get_chunk()
  File "/usr/local/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1810, in get_chunk
    return self.read(nrows=size)
  File "/usr/local/lib/python3.10/site-packages/pandas/io/parsers/readers.py", line 1778, in read
    ) = self._engine.read(  # type: ignore[attr-defined]
  File "/usr/local/lib/python3.10/site-packages/pandas/io/parsers/c_parser_wrapper.py", line 276, in read
    values = data.pop(self.index_col[i])
KeyError: 'strain'

An error occurred (see above) that has not been properly handled by Augur.
To report this, please open a new issue including the original command and the error above:
    <https://github.com/nextstrain/augur/issues/new/choose>
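
Until the error message is improved, a malformed row can be found by comparing each row's field count against the header. The following is a minimal sketch and not part of Augur: the function name find_malformed_rows and the sample data are hypothetical, and it assumes a plain tab-delimited file with no quoting, which may differ from how Augur itself parses metadata.

```python
import csv
import io


def find_malformed_rows(handle, delimiter="\t"):
    """Return (line_number, n_fields, n_expected) for every row whose
    field count differs from the header's field count.

    Hypothetical helper for illustration; not an Augur function.
    """
    # QUOTE_NONE treats quote characters literally, so one line is one record.
    reader = csv.reader(handle, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    header = next(reader, None)
    if header is None:
        return []  # empty input: nothing to check
    expected = len(header)
    problems = []
    for row in reader:
        # csv.reader yields [] for blank lines; skip those.
        if row and len(row) != expected:
            problems.append((reader.line_num, len(row), expected))
    return problems


# Made-up example: the third line has an extra tab inside a value.
example = (
    "strain\tdate\tcountry\n"
    "A\t2020-01-01\tUSA\n"
    "B\t2020-01-02\t\tCanada\n"
)
problems = find_malformed_rows(io.StringIO(example))
for line_number, found, expected in problems:
    print(f"line {line_number}: found {found} fields, expected {expected}")
# prints "line 3: found 4 fields, expected 3"
```

For a file on disk, open it with open(path, newline="") and pass the handle to the same function.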

joverlee521 commented 1 month ago

Ah, duplicate of https://github.com/nextstrain/augur/issues/1195