nextstrain / augur

Pipeline components for real-time phylodynamic analysis
https://docs.nextstrain.org/projects/augur/
GNU Affero General Public License v3.0

augur curate: improve metadata header parsing #1553

Open joverlee521 opened 1 month ago

joverlee521 commented 1 month ago

First noticed in https://github.com/nextstrain/augur/pull/1550#discussion_r1683425790: our blind use of the first row as headers can cause misleading errors.

We should assert that the headers are unique, non-empty values since augur curate converts each record into a Python dictionary.
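To illustrate why this matters: building a dictionary from a header row with repeated names keeps only the last value for each name, so data is silently dropped. Below is a minimal sketch of such a check, not augur's actual implementation. The function names `validate_headers` and `read_records`, the error wording, and the tab delimiter are all assumptions made for illustration.

```python
import csv
import io
from collections import Counter


def validate_headers(headers):
    """Raise ValueError if any header is empty or appears more than once.

    Hypothetical helper for illustration; not part of augur.
    """
    # Column positions (0-based) whose header is empty or whitespace only.
    empty = [i for i, name in enumerate(headers) if not name.strip()]
    # Non-empty names that occur more than once.
    counts = Counter(headers)
    duplicates = sorted(n for n, c in counts.items() if c > 1 and n.strip())

    problems = []
    if empty:
        problems.append(f"empty header(s) at column index {empty}")
    if duplicates:
        problems.append(f"duplicate header(s) {duplicates}")
    if problems:
        raise ValueError("Invalid metadata headers: " + "; ".join(problems))
    return headers


def read_records(handle, delimiter="\t"):
    """Yield one dict per row, after checking the first row as headers."""
    reader = csv.reader(handle, delimiter=delimiter)
    headers = next(reader, None)
    if headers is None:  # empty input: nothing to yield
        return
    validate_headers(headers)
    for row in reader:
        yield dict(zip(headers, row))


if __name__ == "__main__":
    bad = io.StringIO("strain\t\tstrain\nA\tx\tB\n")
    try:
        list(read_records(bad))
    except ValueError as err:
        print(err)
```

Failing up front with a message that names the offending columns avoids the misleading downstream errors described above.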

tsibley commented 1 month ago

> We should assert that the headers are unique, non-empty values since augur curate converts each record into a Python dictionary.

Alternatively, we could name unnamed or duplicate-named columns by their index. This lets a user fix the issue within augur curate instead of being blocked.
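A rough sketch of what index-based naming could look like, assuming a hypothetical `column_<index>` scheme with 0-based positions. The function name, the prefix, and the choice to rename every occurrence of a duplicated name (rather than keeping the first) are illustrative assumptions, not a proposed augur API.

```python
from collections import Counter


def name_columns_by_index(headers, prefix="column_"):
    """Replace empty or repeated header names with index-based names.

    Hypothetical helper for illustration; not part of augur.
    """
    counts = Counter(headers)
    # Names that are non-empty and occur exactly once are kept as-is.
    keep = {name for name in headers if name.strip() and counts[name] == 1}
    # Track every name already in use so generated names stay unique.
    taken = set(keep)

    renamed = []
    for index, name in enumerate(headers):
        if name in keep:
            renamed.append(name)
            continue
        candidate = f"{prefix}{index}"
        # Avoid clashing with a real header that happens to look generated.
        while candidate in taken:
            candidate += "_"
        taken.add(candidate)
        renamed.append(candidate)
    return renamed


if __name__ == "__main__":
    print(name_columns_by_index(["strain", "", "date", "date"]))
```

With names like these, every record still converts to a dictionary without losing values, and a later step could rename or drop the generated columns.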