Closed lueck closed 4 years ago
This can be accomplished more easily, without joining different days and without having to take care of "cuts" between spatial units:
time_series$RateTag <- time_series$AnzahlFall / (time_series$KumAnzahlFall - time_series$AnzahlFall)
time_series$DopplTage <- log(2) / log(1 + time_series$RateTag)
time_series_clean <- time_series
time_series_clean$DopplTage[is.na(time_series_clean$DopplTage) | (time_series_clean$DopplTage > 100)] <- NA
This works because the previous day's cumulative case count can be calculated as the cumulative count minus the new cases.
EDIT: The above is the growth rate. The growth factor would be
time_series$KumAnzahlFall / (time_series$KumAnzahlFall - time_series$AnzahlFall)
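To make the relation between rate, factor and doubling time concrete, here is a minimal sketch on made-up numbers. The toy data and the column name `FaktorTag` are assumptions for illustration only; they are not part of the RKI data or of the package.

```r
# Toy data (made up for illustration): daily new and cumulative cases
time_series <- data.frame(
  Meldedatum    = as.Date("2020-03-01") + 0:3,
  AnzahlFall    = c(10, 10, 20, 40),
  KumAnzahlFall = c(10, 20, 40, 80)
)

# Cumulative cases of the previous day, derived within the same row
prev <- time_series$KumAnzahlFall - time_series$AnzahlFall

time_series$RateTag   <- time_series$AnzahlFall / prev     # growth rate
time_series$FaktorTag <- time_series$KumAnzahlFall / prev  # growth factor (hypothetical column name)
time_series$DopplTage <- log(2) / log(time_series$FaktorTag)

# On the first day of a series the previous cumulative count is 0, so the
# factor is Inf and the doubling time would come out as 0; mask such rows.
time_series$DopplTage[!is.finite(time_series$FaktorTag)] <- NA
```

Since the factor equals 1 plus the rate, `log(2) / log(FaktorTag)` gives the same doubling time as `log(2) / log(1 + RateTag)`.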
Yes, I see. I must have been blind coming from JHU data.
The solution you describe is even more robust to changes in the data acquisition interval, while mine requires exact intervals of 86400 seconds.
I added a function for calculating the growth factor (German: Vervielfaeltigung pro Tag, bestandsspezifischer Wachstumsfaktor) in RKI data that have previously been restructured with `group_RKI_timeseries()`. It is based on `KumAnzahlFall`. The function works on arbitrary groupings made with `group_RKI_timeseries()`.
How it is implemented:
1) Calculate the Cartesian product of the restructured data with itself (cf. self cross join).
2) Filter rows where `Meldedatum.x == Meldedatum.y + 1 Day`.
3) Calculate the quotient `KumAnzahlFall.x / KumAnzahlFall.y`.
4) Replace `Inf` and `NaN` values with `NA`.
5) Left join to add the new column to the restructured data.
For performance reasons, an inner join is used instead of a cross join where possible. The inner join also allows the growth factor to be calculated for arbitrary groupings made with `group_RKI_timeseries()`, using the `by` argument.
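The join-based approach can be sketched in base R as follows. This is not the package's actual code: it collapses the steps into a single left join on a date shifted by one day. The toy data, the grouping by `Bundesland`, and the column names `KumAnzahlFall.prev` and `Faktor` are assumptions for illustration.

```r
# Toy grouped data (made up): one row per group and reporting date
grouped <- data.frame(
  Bundesland    = rep(c("A", "B"), each = 3),
  Meldedatum    = rep(as.Date("2020-03-01") + 0:2, times = 2),
  KumAnzahlFall = c(10, 20, 30, 0, 0, 5)
)

# Right-hand side of the self join: shift each date forward by one day, so an
# equality join pairs every row with the previous day of the same group
previous <- data.frame(
  Bundesland         = grouped$Bundesland,
  Meldedatum         = grouped$Meldedatum + 1,
  KumAnzahlFall.prev = grouped$KumAnzahlFall
)

# Left join: rows without a previous day are kept and get NA
result <- merge(grouped, previous,
                by = c("Bundesland", "Meldedatum"), all.x = TRUE)
result <- result[order(result$Bundesland, result$Meldedatum), ]

# Growth factor; Inf (x / 0) and NaN (0 / 0) are replaced with NA
result$Faktor <- result$KumAnzahlFall / result$KumAnzahlFall.prev
result$Faktor[!is.finite(result$Faktor)] <- NA
```

Joining on the grouping columns plus the shifted date avoids materialising the full Cartesian product, and a different grouping only changes the `by` vector.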