Closed trvrb closed 5 months ago
I pushed up a small change to match the threshold numbers in the viz app.
Thanks for the catch @joverlee521. I'm going to go ahead and merge this now.
This PR drops the location count threshold (i.e., the number of sequences collected in the past 30 days) from 100 to 50 for clade-level analysis and from 300 to 150 for lineage-level analysis.
With current data, this raises the number of locations included for clades from 8 to 11.
With current data, this raises the number of locations included for lineages from 5 to 7.
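As a rough illustration of how a threshold like this acts as a filter, here is a minimal sketch. It is not the code changed in this PR: the function names, the `(location, collection_date)` record shape, and the exact window boundaries are assumptions made for the example. Only the threshold values (50 for clades, 150 for lineages) come from the description above.

```python
from datetime import date, timedelta

# Thresholds from this PR: minimum sequences collected in the past 30 days.
CLADE_MIN_SEQS = 50
LINEAGE_MIN_SEQS = 3 * CLADE_MIN_SEQS  # 150, keeping the 3x ratio


def recent_counts(records, today, window_days=30):
    """Count sequences per location collected in the window ending at `today`.

    `records` is an iterable of (location, collection_date) pairs; this
    shape is hypothetical and chosen only for the example.
    """
    cutoff = today - timedelta(days=window_days)
    counts = {}
    for location, collected in records:
        # Keep records collected after the cutoff, up to and including today.
        if cutoff < collected <= today:
            counts[location] = counts.get(location, 0) + 1
    return counts


def included_locations(counts, min_seqs):
    """Return the sorted locations whose recent count meets the threshold."""
    return sorted(loc for loc, n in counts.items() if n >= min_seqs)
```

With this sketch, a location with 60 recent sequences would pass `CLADE_MIN_SEQS` but not `LINEAGE_MIN_SEQS`, which is the behavior the two thresholds are meant to produce.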
To support these thresholds, I looked at location count for different countries analyzed in bedford.io/papers/abousamra-ncov-forecasting-fit/ to get specific count thresholds. We see:
I believe this suggests that a threshold of 50 sequences in the previous 30 days should be roughly consistent with a ~10% forecasting error. This seems like an okay threshold for public display.
It's less certain what count threshold to use for lineages, where we have a significantly larger number of labels than we do for clades. Keeping a 3x ratio here for now.