Closed dankelley closed 4 years ago
This doesn't seem to be related to the package... anyway, here is the solution:
library(COVID19)
d <- covid19(end=Sys.Date()-1)
cat("World:\n ", max(d$deaths), "deaths\n")
for (country_name in c("Australia", "Canada", "United Kingdom", "United States")) {
    cat(country_name, ":\n", sep="")
    sub1 <- subset(d, country == country_name)
    cat(" method 1 reveals ", max(sub1$deaths), "deaths\n")
    sub2 <- d[d$country == country_name, ]
    cat(" method 2 reveals ", max(sub2$deaths), "deaths\n")
}
When you run subset(d, d$country == country), the variable country is the column of d, not the country variable you defined above. See the documentation at ?subset.
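The name clash described above can be reproduced without downloading anything. The following is a minimal sketch on a made-up three-row data frame (the values are invented for illustration and are not COVID19 data); it assumes only base R.

```r
# Toy data frame standing in for the real table (values are made up).
d <- data.frame(
  country = c("Canada", "Canada", "Italy"),
  deaths  = c(1, 5, 9)
)

country <- "Canada"

# Inside subset(), `country` is looked up in d first, so it resolves to the
# column d$country. The condition compares the column with itself and is
# TRUE for every row, so nothing is filtered out.
all_rows <- subset(d, d$country == country)
nrow(all_rows)

# A variable name that is not also a column of d avoids the clash.
country_name <- "Canada"
one_country <- subset(d, country == country_name)
max(one_country$deaths)

# With [ the condition is evaluated in the calling scope, so the column
# must be written explicitly as d$country.
same_result <- d[d$country == country_name, ]
nrow(same_result)
```

Because the unfiltered result contains every row, a per-country maximum computed from it equals the maximum over the whole table, which matches the symptom reported in this thread.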
Thanks. I just ran your suggested code. It updated COVID19, and I got the output below. Do you get something similar? I notice 0 deaths for two countries that have had deaths, and for the US, I get the same as for the world.
I'm sorry to be a bother.
World:
56259 deaths
Australia:
method 1 reveals 0 deaths
method 2 reveals 0 deaths
Canada:
method 1 reveals 0 deaths
method 2 reveals 0 deaths
United Kingdom:
method 1 reveals 21092 deaths
method 2 reveals 21092 deaths
United States:
method 1 reveals 56259 deaths
method 2 reveals 56259 deaths
First, I apologize that this issue is quite long. You can basically see my problem by looking at the code and output blocks at the bottom. I think there may be a problem with COVID19 that did not exist yesterday.
I'm wondering whether something has changed very recently with COVID19, in the deaths column. Below is some code that shows unexpected results. I am not sure whether this is a difficulty in how subset is working, how [ is working, or perhaps in the deaths column. I am not familiar with working with tibbles, having started using R long before they were invented, so maybe both my trial methods for extracting data are faulty?
NOTE: I am not querying by ISO codes for country names, because I simply don't know all the names, whereas I do know the actual names. Also, I'm doing this for nearly 200 countries, and I fear that calling covid19() that many times will be slow.
My confusion points are
- Why do [ and subset give different results?
- Why does subset give incorrect results (i.e. max per country is identical to max per world)?
- Why does [ work so differently for different countries?
As a clue, I am pretty sure the results I am getting this morning are different from those I got yesterday; the previous results were not giving 0 deaths in countries where I know for sure there have been deaths.
The R code
gives output