Issues with WIS data

[ ] City/neighborhood figures have no valid confidence levels. A village of 200 people, all of whom report daily, gets the same confidence level as a city with 1M residents and 200 reports per day. This biases the ranking towards big cities: a large city is more likely to appear even when its reporting rate is low.
[ ] Hard cut-offs for confidence. City rankings are produced by sorting on SR (symptoms ratio) across all cities with more than 200 reports. A city that had 201 reports and held the #1 spot in the ranking might therefore disappear from it the next day, without any explanation, just because it got "only" 199 reports. Ranking algorithms should avoid hard cut-offs and use confidence levels instead, so that a city's position degrades gracefully when its report count drops.
[ ] Values are aggregated over X days. To reduce noise, WIS currently averages all values over 10 days. While this does reduce the noise, it also hides spikes in SR in specific areas. We would like to show the "smoothed" values alongside the raw daily values for both SR and report counts. So far, WIS has agreed to include only the daily number of reporters, not the daily SR values.
[ ] SR numbers are hard to explain. The numbers we receive are the result of processing and do not correspond to anything tangible that could easily be communicated to end users (the WIS explanation was: "processing of the symptoms that is based on the symptoms prevalence in COVID19 confirmed patients"). End users ask questions such as "what is '4', and how much worse is it than '3'?"
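
As a starting point for the first two items, here is a minimal sketch of a ranking that uses a confidence-adjusted score instead of a 200-report cut-off. It is only an illustration and not what WIS does. It assumes (hypothetically) that SR can be treated as a mean over individual reports, that per-city population figures are available, and that a single spread value `sr_std` is good enough for all cities. `CityStats`, `standard_error`, `ranking_score` and `rank_cities` are made-up names. The score is the lower bound of an approximate confidence interval with a finite-population correction, so a village where everyone reports is trusted more than a big city with the same number of reports.

```python
import math
from dataclasses import dataclass


@dataclass
class CityStats:
    name: str
    population: int  # number of residents (assumed to be known)
    reports: int     # number of reports in the period
    sr: float        # symptoms ratio as received from WIS


def standard_error(city: CityStats, sr_std: float = 1.0) -> float:
    """Approximate standard error of SR, with finite-population correction.

    sr_std is an assumed spread of per-report values; it is a placeholder,
    not a figure taken from WIS data.
    """
    n, pop = city.reports, city.population
    if n <= 0:
        return math.inf  # no reports: no confidence at all
    if pop <= 1 or n >= pop:
        return 0.0       # everyone reported: nothing left to estimate
    fpc = math.sqrt((pop - n) / (pop - 1))
    return sr_std / math.sqrt(n) * fpc


def ranking_score(city: CityStats, sr_std: float = 1.0, z: float = 1.96) -> float:
    """Lower confidence bound on SR; shrinks smoothly as reports drop."""
    return city.sr - z * standard_error(city, sr_std)


def rank_cities(cities, sr_std: float = 1.0, z: float = 1.96):
    """Sort cities by confidence-adjusted score, highest first, no cut-off."""
    return sorted(cities, key=lambda c: ranking_score(c, sr_std, z), reverse=True)
```

With this kind of score, a city going from 201 to 199 reports moves by a tiny amount instead of vanishing from the list, and a city with very few reports sinks gradually rather than being excluded outright.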

hasadna / avid-covider

Issues with WIS data #271