joachim-gassen / tidycovid19

{tidycovid19}: An R Package to Download, Tidy and Visualize Covid-19 Related Data
https://joachim-gassen.github.io/tidycovid19/

download_merged_data(cached=FALSE) doesn't work #27

Closed AndreaPi closed 3 years ago

AndreaPi commented 4 years ago

Since the data cache in GitHub hasn't been updated in the last 2 days, I tried to use

download_merged_data(cached=FALSE)

It fails with the following error:

.
.
.
Start downloading Google Trends data

Pulling Google trend data for IT ...Error in `[<-.data.frame`(`*tmp*`, , timevar, value = "subject") : 
  replacement has 1 row, data has 0
In addition: Warning messages:
1: In countrycode::countrycode(.data$`Country/Region`, origin = "country.name",  :
  Some values were not matched unambiguously: Diamond Princess, Kosovo, MS Zaandam

2: In countrycode::countrycode(.data$`Country/Region`, origin = "country.name",  :
  Some values were not matched unambiguously: Diamond Princess, Kosovo, MS Zaandam

3: In countrycode::countrycode(.data$`Country/Region`, origin = "country.name",  :
  Some values were not matched unambiguously: Diamond Princess, Kosovo, MS Zaandam

4: In countrycode::countrycode(.data$country, origin = "country.name",  :
  Some values were not matched unambiguously: Diamond Princess, Kosovo, MS Zaandam

Maybe the Google Trends API or data format just changed? Can you fix it, or otherwise add an option that makes downloading the Google Trends data optional? Thanks!

joachim-gassen commented 4 years ago

Thank you for this! It seems related to #26, but now I know which function is causing the problem. When you have time, could you rerun the code and send me the output of traceback()? Thanks again!

joachim-gassen commented 4 years ago

btw: I just updated the data so download_merged_data(cached = TRUE) should give you an updated dataset in case you need it.
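
For scripts that should keep running while the uncached download is broken, one option is to try the live download first and fall back to the cached data on error. The following is only a minimal sketch: `with_fallback` is a hypothetical helper, not part of {tidycovid19}, and the commented usage assumes the package is installed and the network is reachable.

```r
# Hypothetical helper (not part of {tidycovid19}): run `primary()`, and if it
# raises an error, report the message and return the result of `fallback()`.
with_fallback <- function(primary, fallback) {
  tryCatch(
    primary(),
    error = function(e) {
      message("Primary download failed: ", conditionMessage(e))
      fallback()
    }
  )
}

# Usage with {tidycovid19} (not run here; needs the package and network access):
# df <- with_fallback(
#   function() tidycovid19::download_merged_data(cached = FALSE),
#   function() tidycovid19::download_merged_data(cached = TRUE)
# )
```

Passing the two downloads as functions delays their evaluation, so the cached download is only attempted when the live one actually fails.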

AndreaPi commented 4 years ago

Here's the result of a traceback():

> download_merged_data(cached=FALSE)
Start downloading JHU CSSE Covid-19 data
.
.
.
Start downloading Google Trends data

Pulling Google trend data for IT ...Error in `[<-.data.frame`(`*tmp*`, , timevar, value = "subject") : 
  replacement has 1 row, data has 0
In addition: Warning messages:
1: In countrycode::countrycode(.data$`Country/Region`, origin = "country.name",  :
  Some values were not matched unambiguously: Diamond Princess, Kosovo, MS Zaandam

2: In countrycode::countrycode(.data$`Country/Region`, origin = "country.name",  :
  Some values were not matched unambiguously: Diamond Princess, Kosovo, MS Zaandam

3: In countrycode::countrycode(.data$`Country/Region`, origin = "country.name",  :
  Some values were not matched unambiguously: Diamond Princess, Kosovo, MS Zaandam

4: In countrycode::countrycode(.data$country, origin = "country.name",  :
  Some values were not matched unambiguously: Diamond Princess, Kosovo, MS Zaandam

> traceback()
18: stop(sprintf(ngettext(N, "replacement has %d row, data has %d", 
        "replacement has %d rows, data has %d"), N, n), domain = NA)
17: `[<-.data.frame`(`*tmp*`, , timevar, value = "subject")
16: `[<-`(`*tmp*`, , timevar, value = "subject")
15: FUN(X[[i]], ...)
14: lapply(seq_along(times), function(i) {
        d[, timevar] <- times[i]
        varying.i <- vapply(varying, `[`, i, FUN.VALUE = character(1L))
        d[, v.names] <- data[, varying.i]
        if (is.null(new.row.names)) 
            row.names(d) <- paste(ids, times[i], sep = ".")
        else row.names(d) <- new.row.names[(i - 1L) * NROW(d) + 1L:NROW(d)]
        d
    })
13: do.call(rbind, lapply(seq_along(times), function(i) {
        d[, timevar] <- times[i]
        varying.i <- vapply(varying, `[`, i, FUN.VALUE = character(1L))
        d[, v.names] <- data[, varying.i]
        if (is.null(new.row.names)) 
            row.names(d) <- paste(ids, times[i], sep = ".")
        else row.names(d) <- new.row.names[(i - 1L) * NROW(d) + 1L:NROW(d)]
        d
    }))
12: reshapeLong(data, idvar = idvar, timevar = timevar, varying = varying, 
        v.names = v.names, drop = drop, times = times, ids = ids, 
        new.row.names = new.row.names)
11: reshape(df, varying = tolower(colnames(df)[2]), v.names = "value", 
        direction = "long", timevar = "related_topics", times = tolower(colnames(df)[2]))
10: FUN(X[[i]], ...)
9: lapply(index, extract_related_topics, raw_data = raw_data)
8: FUN(X[[i]], ...)
7: lapply(i, create_related_topics_payload, widget = widget, hl = hl, 
       tz = tz)
6: related_topics(widget, comparison_item, hl, tz)
5: gtrendsR::gtrends(search_term, geo = iso2c, time = time)
4: FUN(X[[i]], ...)
3: lapply(gtrends_global$iso2c, pull_gt_country_data)
2: download_google_trends_data(search_term, c("country_day", "country"), 
       silent = silent)
1: download_merged_data(cached = FALSE)

joachim-gassen commented 4 years ago

Thanks! This is an issue in the {gtrendsR} package. I opened a PR on the repo that should fix the problem (https://github.com/PMassicotte/gtrendsR/pull/353) but we will have to wait and see what the maintainer thinks about it. If you need to pull merged data directly (not from the cache) you could install my fork of the {gtrendsR} package that contains the fix.

remotes::install_github("joachim-gassen/gtrendsR")

Use it at your own risk, though, as I cannot be sure whether my 'fix' in the {gtrendsR} package accidentally breaks something else.

I will leave this issue open until the underlying issue in {gtrendsR} is fixed.
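
For context, the traceback fails at `d[, timevar] <- times[i]` inside `reshape()`, which suggests that the data frame reaching that line had zero rows. The base R behaviour can be reproduced without any network access. This is a minimal sketch: `empty_df` is a hypothetical stand-in for the empty table, and `safe_add_column` only illustrates the kind of guard a fix might use; it is an assumption, not the actual patch in the PR.

```r
# Hypothetical stand-in for an empty "related topics" table (zero rows).
empty_df <- data.frame(id = character(0), stringsAsFactors = FALSE)

# Assigning a length-1 value to a new column of a zero-row data frame via
# `[<-` raises an error; capture its message instead of stopping.
msg <- tryCatch(
  {
    empty_df[, "related_topics"] <- "subject"
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Illustrative guard (an assumption, not the actual fix): return NULL for
# empty input instead of attempting the column assignment.
safe_add_column <- function(df, name, value) {
  if (nrow(df) == 0) {
    return(NULL)
  }
  df[, name] <- value
  df
}
```

With a guard of this kind, an empty result is skipped rather than aborting the whole download.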

AndreaPi commented 4 years ago

For now I'll go with the cached data since you updated it, but thanks a bunch for the PR, let's hope it gets merged soon! Great job 😀

joachim-gassen commented 3 years ago

Oops. Stale issue. Closed.