nextstrain / fauna

RethinkDB database to support real-time virus analysis
GNU Affero General Public License v3.0

Decouple data curation and upload #162

Open joverlee521 opened 2 months ago

joverlee521 commented 2 months ago

Additional context in Slack

Part of the data curation happens inside vdb/upload and tdb/upload, which makes curation issues difficult to debug and the curation steps hard to share with external groups.

Potential solutions

  1. Disentangle data curation from data upload within fauna.
  2. Start brand-new ingest workflows for curation, with the results then optionally uploaded to fauna.
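
To make the separation concrete, here is a rough sketch of what either option could look like. It is not fauna's actual code: the function names (`curate`, `write_ndjson`, `read_ndjson`, `run`), the NDJSON intermediate file, and the specific curation rules are all assumptions for illustration. The idea is that curation is a pure step that writes a file anyone can inspect or share, and upload is an optional step that only reads that file.

```python
import json
from typing import Callable, Iterable, Iterator, List, Optional


def curate(records: Iterable[dict]) -> Iterator[dict]:
    """Curation only: normalize fields without touching any database."""
    for record in records:
        curated = dict(record)
        # Hypothetical curation rules, for illustration only.
        curated["strain"] = record["strain"].strip().replace(" ", "_")
        curated["country"] = record.get("country", "").strip() or None
        yield curated


def write_ndjson(records: Iterable[dict], path: str) -> int:
    """Write one JSON object per line and return the number of records."""
    count = 0
    with open(path, "w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, sort_keys=True) + "\n")
            count += 1
    return count


def read_ndjson(path: str) -> Iterator[dict]:
    """Read records back from an NDJSON file, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield json.loads(line)


def run(
    raw: Iterable[dict],
    curated_path: str,
    upload: Optional[Callable[[List[dict]], None]] = None,
) -> int:
    """Curate to a file first; upload only if an uploader is supplied."""
    count = write_ndjson(curate(raw), curated_path)
    if upload is not None:
        # The uploader sees only the curated file contents, never raw input.
        upload(list(read_ndjson(curated_path)))
    return count
```

With this shape, a curation bug can be reproduced by running `run(raw, "curated.ndjson")` with no uploader and inspecting the output file, and external groups can reuse `curate` without any database access.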