nextstrain / fauna

RethinkDB database to support real-time virus analysis
GNU Affero General Public License v3.0

tdb/cdc_upload: Add filter option for new titers #107

Closed by joverlee521 3 years ago

joverlee521 commented 3 years ago

The CDC is moving to a new table format that includes all titer data. We should only ingest the titers that are marked reportable.

The new --filter option keeps only the rows where titer_reportable is True and writes them to a new TSV, with columns renamed according to source-data/cdc_titer_column_map.tsv. The renamed columns should match the column names of the previous CDC titer table so that the cdc_upload class does not need to be updated.
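For illustration, the filter-and-rename step could look roughly like the sketch below. This is not the code in this PR: the function name, the example column names (virus_strain_new, titer_value), and the representation of the column map as a dict of new name to previous name are all assumptions. Only titer_reportable comes from the description above.

```python
import csv
import io


def filter_and_rename(titers_file, column_map, out_file):
    """Write only reportable rows to out_file, with a renamed header.

    column_map is assumed to map new CDC column names to the previous
    names; columns missing from the map keep their original name.
    Returns the number of rows written.
    """
    reader = csv.reader(titers_file, delimiter="\t")
    writer = csv.writer(out_file, delimiter="\t", lineterminator="\n")

    header = next(reader)
    flag_index = header.index("titer_reportable")
    writer.writerow([column_map.get(name, name) for name in header])

    kept = 0
    for row in reader:
        # Keep the row only if the flag is present and reads "True".
        if len(row) > flag_index and row[flag_index].strip().lower() == "true":
            writer.writerow(row)
            kept += 1
    return kept


# Hypothetical input: column names and values are made up.
titers = (
    "virus_strain_new\ttiter_value\ttiter_reportable\n"
    "A/X/1\t160\tTrue\n"
    "A/Y/2\t80\tFalse\n"
)
column_map = {"virus_strain_new": "virus_strain", "titer_value": "titer"}
out = io.StringIO()
kept = filter_and_rename(io.StringIO(titers), column_map, out)
```

Because only the header is rewritten, the downstream upload code would see the same column names as before, which is the point of the column map.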


joverlee521 commented 3 years ago

CDC replied that they now filter out erroneous data themselves, so we can ingest all titer data in the table. I renamed the new option to --rename, since all it does now is rename the columns.
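With the filter gone, the remaining step reduces to rewriting the header line. A minimal sketch follows, assuming the column map file is a two-column TSV of new name and previous name. The actual layout of source-data/cdc_titer_column_map.tsv is not shown here, and the function names and example columns are hypothetical.

```python
import csv
import io


def load_column_map(map_file):
    """Read an assumed two-column TSV (new name, previous name) into a dict."""
    reader = csv.reader(map_file, delimiter="\t")
    return {row[0]: row[1] for row in reader if len(row) >= 2}


def rename_columns(titers_file, column_map, out_file):
    """Copy a TSV to out_file, renaming only the header columns."""
    header = titers_file.readline().rstrip("\r\n").split("\t")
    out_file.write("\t".join(column_map.get(name, name) for name in header) + "\n")
    # Data rows pass through unchanged, so every titer is kept.
    for line in titers_file:
        out_file.write(line)


# Hypothetical map and input: names and values are made up.
column_map = load_column_map(io.StringIO("virus_strain_new\tvirus_strain\n"))
out = io.StringIO()
rename_columns(
    io.StringIO("virus_strain_new\ttiter_reportable\nA/X/1\tFalse\n"),
    column_map,
    out,
)
```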