There are a couple of issues with these updates.
First, they seem to remove the filling of NAs in the retrospective sequence counts.
Without this filling, days with no sequences for any variant will have an NA frequency for all variants, since the computation then includes NAs. You can see this by printing `new_truth` immediately after. There should be zero NAs here, since it is constructed to fill in counts for all locations, variants, and dates, and to assume a count of 0 where a combination is absent from the data. I've reinstated the `fillna(0)` portion to fix the issue.
Second, I've switched the merge order, so predictions are only merged if they correspond to a date, variant, location combination in our truth set.
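For reference, here is a minimal sketch of the two points above, assuming pandas. The column names (`sequences`, `pred_freq`) and the toy data are made up for illustration and are not the actual code in this PR; only `new_truth`, `merged`, and `fillna(0)` come from the discussion.

```python
import pandas as pd

keys = ["date", "location", "variant"]

# Hypothetical observed counts: only rows that actually appear in the data.
observed = pd.DataFrame({
    "date": ["2023-01-01", "2023-01-01", "2023-01-03"],
    "location": ["USA", "USA", "USA"],
    "variant": ["A", "B", "A"],
    "sequences": [5, 3, 2],
})

# Full grid of every date x location x variant combination.
grid = pd.MultiIndex.from_product(
    [["2023-01-01", "2023-01-02", "2023-01-03"], ["USA"], ["A", "B"]],
    names=keys,
).to_frame(index=False)

# Left-merge the counts onto the grid, then treat absent rows as zero counts.
new_truth = grid.merge(observed, on=keys, how="left")
new_truth["sequences"] = new_truth["sequences"].fillna(0)

# Hypothetical predictions, including a date that is not in the truth set.
predictions = pd.DataFrame({
    "date": ["2023-01-01", "2023-01-04"],
    "location": ["USA", "USA"],
    "variant": ["A", "A"],
    "pred_freq": [0.6, 0.7],
})

# Truth on the left: predictions outside the truth set are dropped.
merged = new_truth.merge(predictions, on=keys, how="left")
```

With the truth set on the left of the merge, the 2023-01-04 prediction has nothing to attach to and does not show up in `merged`.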
Lastly, I've re-implemented the changes we discussed previously to make sure that we do not floor the raw or smoothed frequencies to 0 when they are NA.
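To illustrate the distinction between zero counts and NA frequencies, here is a rough sketch, again assuming pandas and made-up column names (`sequences`, `raw_freq`). It only shows the raw frequency; the same idea would apply to the smoothed one.

```python
import pandas as pd

# Hypothetical truth counts after zero-filling; 2023-01-02 has no sequences at all.
truth = pd.DataFrame({
    "date": ["2023-01-01", "2023-01-01", "2023-01-02",
             "2023-01-02", "2023-01-03", "2023-01-03"],
    "location": ["USA"] * 6,
    "variant": ["A", "B", "A", "B", "A", "B"],
    "sequences": [5.0, 3.0, 0.0, 0.0, 2.0, 0.0],
})

# Total sequences per date and location, aligned back to each row.
total = truth.groupby(["date", "location"])["sequences"].transform("sum")

# 0 / 0 gives NaN here. It is left as NaN on purpose: a day with no
# sequences has an unknown frequency, not a frequency of 0.
truth["raw_freq"] = truth["sequences"] / total
```

Variant B on 2023-01-03 gets a genuine frequency of 0 because other sequences were observed that day, while both variants on 2023-01-02 stay NA.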
Bonus: I changed a variable name from `final_set` -> `merged` to make it clearer what is actually going into this function.