maxsal / covid19india

Pull clean data on COVID-19 in India from covid19india.org for use in R!
https://cran.r-project.org/package=covid19india

`check_for_data_correction` for arbitrary variables and fill option #5

Open maxsal opened 2 years ago

maxsal commented 2 years ago

The check_for_data_correction.R script should be modified in two ways.

First, line 28 is hardcoded to fill in `daily_cases`, even though the `var` argument lets you specify an arbitrary column. Line 28 should instead set the column named by `var` to NA.

Second, we should add a `fill` argument such that, if TRUE, missing values in the variable are filled with the last observation carried forward (see `data.table::nafill(type = "locf")`).
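
A minimal sketch of both changes, assuming the rows to blank out have already been flagged. The helper name `blank_and_fill`, its `flagged` argument, and the toy column `daily_deaths` are hypothetical and are not taken from the package; only `data.table::set()`, `data.table::copy()`, and `data.table::nafill()` are real calls.

```r
library(data.table)

# Hypothetical helper, not the package's actual code: set the flagged rows of
# the column named in `var` to NA, then optionally carry the last value forward.
blank_and_fill <- function(dat, var = "daily_cases", flagged, fill = FALSE) {
  # Work on a copy so the caller's table is not modified by reference
  dat <- data.table::copy(data.table::as.data.table(dat))

  # set() takes the column by name, so nothing is hardcoded to daily_cases
  data.table::set(dat, i = which(flagged), j = var, value = NA)

  if (fill) {
    # Last observation carried forward, as suggested in the issue
    data.table::set(dat, j = var,
                    value = data.table::nafill(dat[[var]], type = "locf"))
  }
  dat[]
}

# Toy data: the third row stands in for a data-correction artifact
toy <- data.table(
  date         = as.Date("2021-01-01") + 0:3,
  daily_cases  = c(10, 12, -500, 15),
  daily_deaths = c(1, 2, -40, 3)
)

res <- blank_and_fill(toy, var = "daily_deaths",
                      flagged = toy$daily_deaths < 0, fill = TRUE)
```

Here only `daily_deaths` is touched, because it is the column passed as `var`; `daily_cases` keeps its original values, and with `fill = FALSE` the flagged row would simply stay NA.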

mkleinsa commented 2 years ago

Should the `var` argument default to a vector of columns, c(cases, deaths, recovered), since we typically fill all three variables? We check the condition on cases and assume that the other two variables behave the same way. Is that a valid assumption?

mkleinsa commented 2 years ago

`check_for_data_correction()` was updated to validate a single variable (usually `daily_cases`) and then fill missing values in all three variables by default. I ended up hard-coding those three, but we could also pass them as a vector. I don't see an advantage to making them an argument.

Also re-ordered the logic of get_nat_counts()
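
For reference, the "check one variable, fill all three" behaviour described above could look roughly like the sketch below. This is not the committed code: the function name `correct_counts`, the "negative value" validation rule, and the column names `daily_deaths` and `daily_recovered` are all assumptions made for illustration.

```r
library(data.table)

# Hypothetical sketch: validate on one column, then blank and fill several.
# The validation rule (value < 0) and the column names are assumptions.
correct_counts <- function(dat,
                           check_var = "daily_cases",
                           fill_vars = c("daily_cases", "daily_deaths",
                                         "daily_recovered")) {
  dat <- copy(as.data.table(dat))

  # Rows that fail the check on the single validation column
  bad <- which(dat[[check_var]] < 0)

  for (v in fill_vars) {
    set(dat, i = bad, j = v, value = NA)                      # blank flagged rows
    set(dat, j = v, value = nafill(dat[[v]], type = "locf"))  # carry forward
  }
  dat[]
}

toy <- data.table(
  daily_cases     = c(100, 120, -900, 130),
  daily_deaths    = c(5, 6, 7, 8),
  daily_recovered = c(50, 60, -10, 70)
)

out <- correct_counts(toy)
```

In this toy input the third `daily_deaths` value (7) is perfectly plausible, yet it is replaced by 6 because the row was flagged on `daily_cases` alone. That is the trade-off behind the earlier question about whether the three variables can be assumed to behave the same way.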