neherlab / covid19_scenarios_data

Data preprocessing scripts and preprocessed data storage for COVID-19 Scenarios project
https://github.com/neherlab/covid19_scenarios

Enhancement: Utilize case-count data inside populationData.tsv #15

Closed nnoll closed 4 years ago

nnoll commented 4 years ago

I think the model would be more useful and usable if we did something a bit smarter with the estimates in populationData.tsv, since many countries have staggered epidemics and varied testing capacities. Filling in these initial case counts by hand can be massively improved. I'll propose three alternatives:

  1. Replace suspectedCasesMarch1st with the first date SARS-CoV-2 was detected within each country. The benefit is that this is a rather simple change.
  2. Fit a few select parameters of the model to the case-count data we have. Importantly, this must be kept rather simple, e.g. fit only the date of first introduction and the % of cases caught within a country.
  3. Keep the format the same but dynamically fill in suspectedCasesMarch1st with empirical values.
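
As a rough illustration of options 1 and 3, the sketch below derives a first-detection date and an empirical case count at a reference date from a cumulative case-count series. The function names, the `(date, cases)` input layout, and the example numbers are all assumptions made for illustration; none of them come from the repository.

```python
from datetime import date


def first_detection(series):
    """Return (date, cases) for the earliest entry with a nonzero count.

    `series` is a hypothetical list of (date, cumulative_cases) tuples.
    Returns None if no cases were ever reported.
    """
    for day, cases in sorted(series):
        if cases > 0:
            return day, cases
    return None


def cases_on(series, ref):
    """Return the latest cumulative count reported on or before `ref`.

    Returns 0 if nothing was reported by that date.
    """
    latest = 0
    for day, cases in sorted(series):
        if day > ref:
            break
        latest = cases
    return latest


# Made-up numbers, for illustration only.
example = [
    (date(2020, 2, 25), 0),
    (date(2020, 2, 27), 1),
    (date(2020, 3, 1), 12),
    (date(2020, 3, 4), 40),
]

t_min, cases_at_t_min = first_detection(example)
march_1st_cases = cases_on(example, date(2020, 3, 1))
```

Option 1 would store something like `t_min` per country, while option 3 would write a value like `march_1st_cases` into the existing column. Neither corrects for under-reporting, which is what option 2 tries to address.
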
rneher commented 4 years ago

We could also consider having tMin be part of this. If we include Chinese provinces, March 1st is not sensible...

nnoll commented 4 years ago

Yeah, sorry if this was unclear; this is what I meant with the first point. I think tMin and numberOfCases at tMin are good variables to include in populationData.

rneher commented 4 years ago

I started writing a little script that fits the March 1st number to the case counts. It's not ideal, but better than before. Still WIP: https://github.com/neherlab/covid19_scenarios_data/blob/feat/fit-initial-numbers/fit_initial_cases.py
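
For readers who want a feel for what such a fit might look like, here is a minimal sketch. It is not the linked script; it assumes simple exponential growth and uses an ordinary least-squares fit on log case counts. The names `fit_exponential`, `estimate_cases`, and `reporting_fraction` are hypothetical, and the data in the usage example is synthetic.

```python
import math
from datetime import date, timedelta


def fit_exponential(series):
    """Least-squares fit of log(cases) = a + r * t.

    `series` is a hypothetical list of (date, cumulative_cases) tuples;
    t is measured in days since the first nonzero data point.
    Returns (a, r, t0), where t0 is that first date.
    """
    points = [(d, c) for d, c in sorted(series) if c > 0]
    if len(points) < 2:
        raise ValueError("need at least two nonzero data points")
    t0 = points[0][0]
    xs = [(d - t0).days for d, _ in points]
    ys = [math.log(c) for _, c in points]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("data points must span more than one day")
    r = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    a = mean_y - r * mean_x
    return a, r, t0


def estimate_cases(series, ref, reporting_fraction=1.0):
    """Extrapolate the fitted curve to `ref`.

    Dividing by `reporting_fraction` (an assumed share of cases that get
    detected) turns reported cases into a rough estimate of true cases.
    """
    a, r, t0 = fit_exponential(series)
    reported = math.exp(a + r * (ref - t0).days)
    return reported / reporting_fraction


# Synthetic series: 10 cases on March 5th, growing at rate 0.2 per day.
start = date(2020, 3, 5)
synthetic = [(start + timedelta(days=i), 10 * math.exp(0.2 * i)) for i in range(5)]

march_1st_estimate = estimate_cases(synthetic, date(2020, 3, 1))
```

A log-linear fit is only reasonable during the early growth phase and gives low counts more weight than they deserve; fitting the full model, as suggested earlier in the thread, would be the more careful approach.
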