> read_parquet("https://covid-clade-counts.s3.amazonaws.com/2024-10-14_covid_clade_counts.parquet") |>
+ filter(is.na(date))
# A tibble: 197 × 4
   location     date   clade count
   <chr>        <date> <chr> <int>
 1 South Dakota NA     23I       3
 2 Virginia     NA     21H      12
 3 Mississippi  NA     20I       8
 4 Virginia     NA     23H      12
 5 Louisiana    NA     22C       1
 6 Virginia     NA     23E       7
 7 Maryland     NA     22E       1
 8 South Dakota NA     20A     504
 9 Nebraska     NA     21C       2
10 Maryland     NA     21F       1
# ℹ 187 more rows
# ℹ Use `print(n = ...)` to see more rows
Clade counts have rows with a missing date
The above file was generated using this script.
Do we have an understanding of how these missing dates arise? Is this a bug we need to fix, or something about the source data?
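One way to start narrowing this down is to check whether the `NA` dates are concentrated in a few locations and how large a share of each location's sequences they represent. Below is a rough sketch of that check, not a diagnosis. `summarise_missing_dates()` is a hypothetical helper name, and the code assumes the `location`, `date`, and `count` columns shown in the output above.

```r
library(dplyr)

# Hypothetical helper: for each location that has at least one NA-date row,
# report how many rows and how many sequences lack a date.
summarise_missing_dates <- function(clade_counts) {
  clade_counts |>
    group_by(location) |>
    summarise(
      n_missing_rows = sum(is.na(date)),        # rows with no date
      total_count    = sum(count),              # all sequences for the location
      missing_count  = sum(count[is.na(date)]), # sequences with no date
      .groups = "drop"
    ) |>
    filter(n_missing_rows > 0) |>
    mutate(frac_missing = missing_count / total_count) |>
    arrange(desc(missing_count))
}

# Usage, assuming `clade_counts` holds the result of the read_parquet() call above:
# summarise_missing_dates(clade_counts)
```

If the missing dates cluster in a handful of locations (South Dakota and Virginia stand out in the first ten rows), that would hint at something in the source data. A roughly uniform spread would make a bug in the generating script seem more likely. Either way, this only suggests where to look.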