Closed by semohr 4 years ago
Not too sure what you mean by this. Do you want to calculate the summed cases in the interval [begin_date, end_date], together with the length of that period (end_date - begin_date) in days? I could create a method that does this.
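A minimal sketch of what such a method might look like, assuming the data is a pandas DataFrame of daily new cases indexed by date. The function name `summed_new_cases`, the column name and the sample values are hypothetical and only serve as illustration; they are not taken from the existing code.

```python
import pandas as pd


def summed_new_cases(new_cases: pd.DataFrame, begin_date, end_date):
    """Hypothetical helper: sum daily new cases over [begin_date, end_date].

    Returns the column-wise sums and the length of the period in days
    (end_date - begin_date).
    """
    begin, end = pd.Timestamp(begin_date), pd.Timestamp(end_date)
    # Label-based slicing on a sorted DatetimeIndex includes both endpoints.
    window = new_cases.loc[begin:end]
    return window.sum(), (end - begin).days


# Illustrative data only, not real case numbers.
idx = pd.date_range("2020-03-01", periods=5, freq="D")
new_cases = pd.DataFrame({"Germany": [1, 2, 3, 4, 5]}, index=idx)

total, n_days = summed_new_cases(new_cases, "2020-03-02", "2020-03-04")
```

Note that with both endpoints included, the window covers three daily rows while `end_date - begin_date` is two days, so it would be worth deciding which of the two the method should report.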
I think that if you use df.diff, the number of rows in the table decreases by one. That raises the question of how we deal with it, and how we stay reasonably consistent in what the cumulative and the new_* functions return. I would argue that the most sensible option is to exclude the first date from the returned rows: the new cases are simply the difference between neighbouring rows (df.diff), with the first date index dropped from the result. I don't know exactly what your current implementation does, or how df.diff behaves in that respect. For the cumulative function, I would then also exclude the first row of the results. Otherwise it looks good.
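For reference, pandas' `DataFrame.diff` does not drop a row by itself: it keeps the row count and puts `NaN` in the first row, so the first date has to be excluded explicitly. A small sketch of the proposal, using made-up numbers and a hypothetical column name, and not meant to reflect the actual implementation:

```python
import pandas as pd

# Illustrative cumulative counts only, not real data.
idx = pd.date_range("2020-03-01", periods=4, freq="D")
cumulative = pd.DataFrame({"Germany": [10, 15, 25, 40]}, index=idx)

# diff() returns a frame of the same length; the first row is NaN because
# there is no previous row to subtract from.
raw = cumulative.diff()

# Exclude the first date so that every returned row holds a real difference.
new_cases = raw.iloc[1:]

# For consistency, drop the same first date from the cumulative result.
cumulative_trimmed = cumulative.iloc[1:]
```

With this approach both results share the same date index, starting one day after the first available date.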
The current implementation of the new_* functions should be doing exactly what you described. :thumbsup:
Fixes #8
Additionally, added a method to the JHU source that returns a list of all available countries and states.