Open mpascariu opened 3 years ago
In fact, it is not an isolated event: I can see the same thing in California and Florida too.
Could this be a date formatting issue?
Thanks for reporting Marius, your observations have been reported to the respective collectors. Date formatting is a possibility. Will let you know as soon as it's fixed.
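For illustration, here is a minimal sketch of how a day/month swap could distort a series. The date strings are made up; only the `%d.%m.%Y` format is taken from the database file. If a collector reads such strings as month.day.year, early-March observations land in May, June and July, and days above 12 fail to parse at all.

```r
# Made-up date strings in the day.month.year layout used by the output file
raw <- c("05.03.2020", "06.03.2020", "07.03.2020")

# Intended reading: 5, 6 and 7 March 2020
as_dmy <- as.Date(raw, format = "%d.%m.%Y")

# Swapped reading: 3 May, 3 June and 3 July 2020
as_mdy <- as.Date(raw, format = "%m.%d.%Y")

# A day above 12 cannot be read as a month, so the swapped reading gives NA
bad <- as.Date("13.03.2020", format = "%m.%d.%Y")

print(data.frame(raw, as_dmy, as_mdy))
print(bad)
```

Whether this is what actually happened in the source sheets is only a hypothesis; the sketch just shows that the resulting pattern (misplaced values plus gaps) would be consistent with it.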
Thanks Tim! Here's a view over all US states:
library(tidyverse)

# Read the database output (the first 3 lines are a header block),
# then plot cumulative confirmed cases by age group for each US state.
p <- read_csv(
    file = "data/Output_10_20201208.zip",
    skip = 3) %>%
  mutate(Date = as.Date(Date, format = "%d.%m.%Y"),
         Age = as.factor(Age)) %>%
  arrange(Date) %>%
  filter(Sex == "b",        # both sexes combined
         Country == "USA",
         # Age %in% c(60, 70, 80),
         Cases > 0) %>%
  ggplot(aes(x = Date, y = Cases, color = Age)) +
  geom_line(size = 1) +
  # One panel per state; free scales because state totals differ widely
  facet_wrap(~ Region, scales = "free", ncol = 3) +
  scale_y_continuous(labels = scales::label_number_si(accuracy = 0.1)) +
  labs(title = "Monotonicity of confirmed cases, USA") +
  theme(legend.position = "top")

ggsave("chart.png", p, width = 8, height = 18)
OK, this is a good diagnostic, going through one by one. Making a checklist.
Hi @timriffe, I can see that most of the data for the states of Iowa, California and Washington disappeared altogether from the 07-01-2021 version of the database. Only a few weeks of data are left for each state. Was that done on purpose?
Thanks for reporting! Not on purpose. I'm investigating these one at a time.
California and Washington look good on 08-01-2021; however, the Iowa data still display major gaps between June and September.
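On December 9 I was able to produce this: [image: C19_Cases_dev_Iowa_20201209] https://user-images.githubusercontent.com/6264977/104008213-6de55100-51a9-11eb-8b22-f51246068547.png
Today I can see this: [image: C19_Cases_dev_Iowa_20210108] https://user-images.githubusercontent.com/6264977/104008262-7f2e5d80-51a9-11eb-9343-318b73b61df0.png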
Thanks @mpascariu I did a manual roll-back yesterday in Drive, as automatic captures had been failing for Iowa. Looks like I chose the wrong date. I've been in contact with the source, who tells me the sheet will be released again soon. This will completely overwrite the Iowa series, FYI. It could be a few days before that makes it through. I'll therefore roll back to the sheet status the day prior to Dec 9 and hopefully you'll get that same data back.
ok, great!
The monotonicity issues are not limited to the US regions: they also show up at the country level across the entire database, and have been spotted in various countries.
But maybe a new issue should be opened for this (?)
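As a rough sketch of how such a database-wide check could look: the helper below flags every observation where the cumulative count drops relative to the previous date within the same Country/Region/Sex/Age series. The column names and the date format follow the plotting script earlier in this thread; `flag_decreases` and the toy data are hypothetical and only there to make the example runnable.

```r
library(dplyr)

# Toy data, invented for illustration (not taken from the database)
toy <- data.frame(
  Country = "USA",
  Region  = rep(c("Maine", "Iowa"), each = 3),
  Sex     = "b",
  Age     = 60,
  Date    = rep(c("01.06.2020", "08.06.2020", "15.06.2020"), times = 2),
  Cases   = c(100, 120, 115, 50, 60, 70)
)

# Hypothetical helper: return the rows where cumulative Cases decrease
flag_decreases <- function(df) {
  df %>%
    mutate(Date = as.Date(Date, format = "%d.%m.%Y")) %>%
    group_by(Country, Region, Sex, Age) %>%
    arrange(Date, .by_group = TRUE) %>%
    mutate(change = Cases - lag(Cases)) %>%  # difference to the previous date
    ungroup() %>%
    filter(!is.na(change), change < 0)
}

decreases <- flag_decreases(toy)
print(decreases)
```

Dropping the `Country == "USA"` filter and running something like this on the full file would give one row per violation, which could double as the checklist.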
Hi @timriffe,
I am looking at the confirmed cases for the state of Maine and I see periods with significant jumps. I think it is an isolated event. This might need some attention.
Looking at the weekly number of cases per 100k inhabitants, we would see this:
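For reference, a minimal sketch of how a weekly rate like this can be derived from the cumulative counts. It assumes one series (a single region, sex and age group) with `Date` already parsed; the toy counts and the population figure are invented, and `weekly_per_100k` is a hypothetical helper, not part of the database code.

```r
library(dplyr)

# Toy cumulative series for one region; values are made up
toy <- data.frame(
  Date  = as.Date(c("2020-06-01", "2020-06-04", "2020-06-08",
                    "2020-06-11", "2020-06-15")),
  Cases = c(100, 110, 130, 150, 200)
)

# Hypothetical helper: weekly new cases per 100k inhabitants
weekly_per_100k <- function(df, population) {
  df %>%
    # Monday of the week each observation falls in (%u: 1 = Monday)
    mutate(week_start = Date - (as.integer(format(Date, "%u")) - 1)) %>%
    group_by(week_start) %>%
    # Last cumulative value reported within each week
    summarise(cum_cases = last(Cases, order_by = Date), .groups = "drop") %>%
    arrange(week_start) %>%
    mutate(new_cases = cum_cases - lag(cum_cases),
           per_100k  = new_cases / population * 1e5)
}

# The population value is an arbitrary placeholder, not a real figure
weekly <- weekly_per_100k(toy, population = 1e6)
print(weekly)
```

With this construction a drop in the cumulative series shows up as a negative weekly rate, and a date that was shifted forward shows up as an abrupt jump, which is what makes the per-100k view a convenient diagnostic.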