Open coleeagland opened 1 year ago
I took a quick look, and there is not much difference between our query and Google's.
The query Google's web page sends:
```json
{
  "time": "2017-12-06 2022-12-06",
  "resolution": "WEEK",
  "locale": "en-US",
  "comparisonItem": [
    {
      "geo": { "country": "US" },
      "complexKeywordsRestriction": {
        "keyword": [{ "type": "BROAD", "value": "crib" }]
      }
    }
  ],
  "requestOptions": { "property": "", "backend": "IZG", "category": 0 },
  "userConfig": { "userType": "USER_TYPE_LEGIT_USER" }
}
```
Our query:
```json
{
  "time": "2017-12-06 2022-12-06",
  "resolution": "WEEK",
  "locale": "en-US",
  "comparisonItem": [
    {
      "geo": { "country": "US" },
      "complexKeywordsRestriction": {
        "keyword": [{ "type": "BROAD", "value": "crib" }]
      }
    }
  ],
  "requestOptions": { "category": 0, "backend": "IZG", "property": "" },
  "userConfig": { "userType": "USER_TYPE_SCRAPER" }
}
```
The only difference I can see is the userType. It seems that Google is able to detect that we are scraping their data. I could not find a way to bypass this, but I suspect it is related to the token request: https://github.com/PMassicotte/gtrendsR/blob/d53b9b7bd448180ab8640cba1db07065ff60ab83/R/zzz.R#L113
If anyone has a solution, I would be happy to look at it.
Hi,
Many thanks @PMassicotte for your excellent work on this package. It's greatly appreciated.
I have a similar issue for the keyword "lyme" for France.
```r
lyme <- gtrends(
  keyword = "lyme",
  geo = "FR",
  time = "2004-01-01 2022-12-01",
  gprop = c("web"),
  onlyInterest = TRUE
)$interest_over_time
plot(lyme$date, lyme$hits, type = "l", ylim = c(0, 100))
```
returns a series with no hits at all between 2010 and 2014, even though the Google Trends website does show hits for that period.
I tried on a different computer and with different versions of gtrendsR. There is no issue when using the keyword "maladie de lyme", and no issue for some other countries.
Do you observe the same on your side?
Best, Charles
Same problem here!
Hi again,
Problem solved for France, using the exact same code and the same versions of R, RStudio, and gtrendsR.
But the problem has now appeared for the UK (ISO2 "GB") with the "lyme" keyword.
As for France, using "lyme disease" instead solves the problem in the UK.
Best, Charles
I'm encountering this same issue on an unrelated search term ("inflation").
Same issue happening recently. It does not seem to depend on the particular search term, since it has happened for a number of different terms depending on when I send the query. Below is an example across a variety of telco / insurance terms, where each query is written as `gtrendsR::gtrends(term, geo = "GB", time = "today+5-y")`.
It might be related to the UK only, but I also searched for US terms like "Biden" and got blocks of zeros.
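For reference, this is roughly how I batch those queries. A minimal sketch, assuming a fixed pause between requests; the helper name `fetch_terms` and the term list are illustrative, not part of gtrendsR, and the `fetch` argument just makes the wrapper easy to test with a stub:

```r
# Hypothetical batching helper: queries each term in turn, pausing between
# requests so consecutive calls are spaced out. `fetch` defaults to
# gtrendsR::gtrends but can be swapped for a stub when testing.
fetch_terms <- function(terms, geo = "GB", time = "today+5-y",
                        pause = 5, fetch = gtrendsR::gtrends) {
  setNames(lapply(terms, function(term) {
    Sys.sleep(pause)
    fetch(term, geo = geo, time = time)$interest_over_time
  }), terms)
}

# Illustrative usage (real terms were telco / insurance related):
# results <- fetch_terms(c("broadband deals", "car insurance"))
```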
Configuration:
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.3.1 (2023-06-16 ucrt)
#> os Windows 10 x64 (build 18362)
#> system x86_64, mingw32
#> ui RTerm
#> language (EN)
#> collate English_United Kingdom.utf8
#> ctype English_United Kingdom.utf8
#> tz Europe/Madrid
#> date 2024-01-02
#> pandoc 3.1.8 @ C:/Users/ALBERT~1.AGU/AppData/Local/Pandoc/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> cachem 1.0.8 2023-05-01 [1] CRAN (R 4.3.1)
#> callr 3.7.3 2022-11-02 [1] CRAN (R 4.3.1)
#> cli 3.6.1 2023-03-23 [1] CRAN (R 4.3.1)
#> crayon 1.5.2 2022-09-29 [1] CRAN (R 4.3.1)
#> devtools 2.4.5 2022-10-11 [1] CRAN (R 4.3.1)
#> digest 0.6.33 2023-07-07 [1] CRAN (R 4.3.1)
#> ellipsis 0.3.2 2021-04-29 [1] CRAN (R 4.3.1)
#> evaluate 0.22 2023-09-29 [1] CRAN (R 4.3.1)
#> fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.3.1)
#> fs 1.6.3 2023-07-20 [1] CRAN (R 4.3.1)
#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.3.1)
#> gtrendsR * 1.5.1.9000 2023-11-23 [1] Github (pmassicotte/gtrendsR@d53b9b7)
#> htmltools 0.5.6 2023-08-10 [1] CRAN (R 4.3.1)
#> htmlwidgets 1.6.2 2023-03-17 [1] CRAN (R 4.3.1)
#> httpuv 1.6.11 2023-05-11 [1] CRAN (R 4.3.1)
#> knitr 1.44 2023-09-11 [1] CRAN (R 4.3.1)
#> later 1.3.1 2023-05-02 [1] CRAN (R 4.3.1)
#> lifecycle 1.0.4 2023-11-07 [1] CRAN (R 4.3.2)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.1)
#> memoise 2.0.1 2021-11-26 [1] CRAN (R 4.3.1)
#> mime 0.12 2021-09-28 [1] CRAN (R 4.3.0)
#> miniUI 0.1.1.1 2018-05-18 [1] CRAN (R 4.3.1)
#> pkgbuild 1.4.2 2023-06-26 [1] CRAN (R 4.3.1)
#> pkgload 1.3.3 2023-09-22 [1] CRAN (R 4.3.1)
#> prettyunits 1.2.0 2023-09-24 [1] CRAN (R 4.3.1)
#> processx 3.8.2 2023-06-30 [1] CRAN (R 4.3.1)
#> profvis 0.3.8 2023-05-02 [1] CRAN (R 4.3.1)
#> promises 1.2.1 2023-08-10 [1] CRAN (R 4.3.1)
#> ps 1.7.5 2023-04-18 [1] CRAN (R 4.3.1)
#> purrr 1.0.2 2023-08-10 [1] CRAN (R 4.3.1)
#> R.cache 0.16.0 2022-07-21 [1] CRAN (R 4.3.1)
#> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.3.0)
#> R.oo 1.25.0 2022-06-12 [1] CRAN (R 4.3.0)
#> R.utils 2.12.2 2022-11-11 [1] CRAN (R 4.3.1)
#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.3.1)
#> Rcpp 1.0.11 2023-07-06 [1] CRAN (R 4.3.1)
#> remotes 2.4.2.1 2023-07-18 [1] CRAN (R 4.3.1)
#> reprex 2.0.2 2022-08-17 [1] CRAN (R 4.3.1)
#> rlang 1.1.1 2023-04-28 [1] CRAN (R 4.3.1)
#> rmarkdown 2.25 2023-09-18 [1] CRAN (R 4.3.1)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.1)
#> shiny 1.7.5 2023-08-12 [1] CRAN (R 4.3.1)
#> stringi 1.7.12 2023-01-11 [1] CRAN (R 4.3.0)
#> stringr 1.5.0 2022-12-02 [1] CRAN (R 4.3.1)
#> styler 1.10.2 2023-08-29 [1] CRAN (R 4.3.1)
#> urlchecker 1.0.1 2021-11-30 [1] CRAN (R 4.3.1)
#> usethis 2.2.2 2023-07-06 [1] CRAN (R 4.3.1)
#> vctrs 0.6.3 2023-06-14 [1] CRAN (R 4.3.1)
#> withr 2.5.2 2023-10-30 [1] CRAN (R 4.3.2)
#> xfun 0.40 2023-08-09 [1] CRAN (R 4.3.1)
#> xtable 1.8-4 2019-04-21 [1] CRAN (R 4.3.1)
#> yaml 2.3.7 2023-01-23 [1] CRAN (R 4.3.0)
#>
#> [1] C:/Program Files/R/R-4.3.1/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
Created on 2024-01-02 with reprex v2.0.2
"The free service giveth, the free service taketh." We do not put the zeros in; that may just be what (some?) Google backends deliver for (some?) combinations of terms. Hard to say more.
When you try to download multiple Trends series, Google returns zeros to throttle you. I added a function that checks for these zero blocks and repeats the download when they occur. I ran it for several days and it helped, but in the end I still had to download some series manually.
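For anyone who wants to replicate that approach, here is a minimal sketch. The names `has_zero_block` and `gtrends_retry`, the run-length threshold, and the back-off interval are all my assumptions, not the commenter's actual code; the `fetch` argument exists so the retry logic can be exercised without hitting Google:

```r
# Detect a suspiciously long run of consecutive zero observations in a
# hits vector. "<1" and other non-numeric entries are treated as non-zero.
has_zero_block <- function(hits, min_len = 8) {
  h <- suppressWarnings(as.numeric(hits))
  h[is.na(h)] <- 1
  runs <- rle(h == 0)
  any(runs$lengths[runs$values] >= min_len)
}

# Re-query until interest_over_time comes back without a zero block,
# up to max_tries attempts, sleeping between attempts.
gtrends_retry <- function(keyword, geo = "", time = "today+5-y",
                          max_tries = 5, wait = 10,
                          fetch = gtrendsR::gtrends) {
  res <- NULL
  for (i in seq_len(max_tries)) {
    res <- fetch(keyword, geo = geo, time = time)
    iot <- res$interest_over_time
    if (!is.null(iot) && !has_zero_block(iot$hits)) return(res)
    Sys.sleep(wait)  # back off before retrying
  }
  warning("Zero block still present after ", max_tries, " tries: ", keyword)
  res
}
```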
I am seeing blocks of zeroes returned in the interest_over_time data that don't make sense to me.
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server x64 (build 17763)
gtrendsR 1.5.1
Edit: I was trying to use a bit of shorthand here, but rereading it, that makes things less clear. I am looking specifically at `interest_over_time`.

`gtrends("crib", geo = "US")$interest_over_time`

returns 87 straight weeks of zeroes; the image shows them starting 2019-05-05 for the search term "crib". It is not just this term: it also happened with "espresso". Oddly, "espresso" is missing the 87 weeks up until 2019-04-28, the week before 2019-05-05.
`gtrends("crib")$interest_over_time`

does not have this same issue with these search terms, but it does happen with other search terms.

Things I have tried:
1) Confirmed the Google Trends website does not have the issue.
2) Python's pytrends gives the same result, so it doesn't seem specific to this package, though it is obviously still an issue.
3) Tried a different computer (same result).
4) Called and asked someone I know to try it on their computer (same result).
It does not happen with every search term - I'd love to share some kind of pattern, but I'm just not seeing it.
I suspect this issue might not exist tomorrow with these terms but will be found on others, but... hard to say before tomorrow. I looked at some data I've saved from gtrends() calls in the past and this didn't seem to be happening in August but was happening at the beginning of October.
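In case it helps others check their own saved series: one way to locate runs of consecutive zero weeks in an `interest_over_time` data frame is something like the helper below. It is a hypothetical sketch, not part of gtrendsR; `zero_runs` and its `min_len` threshold are names I made up.

```r
# Hypothetical diagnostic: report start date, end date, and length of each
# run of consecutive zero weeks in an interest_over_time data frame.
zero_runs <- function(iot, min_len = 4) {
  h <- suppressWarnings(as.numeric(iot$hits))
  h[is.na(h)] <- 1  # treat "<1" and other non-numeric entries as non-zero
  r <- rle(h == 0)
  ends   <- cumsum(r$lengths)
  starts <- ends - r$lengths + 1
  keep   <- r$values & r$lengths >= min_len
  data.frame(start = iot$date[starts[keep]],
             end   = iot$date[ends[keep]],
             weeks = r$lengths[keep])
}

# e.g. zero_runs(gtrends("crib", geo = "US")$interest_over_time)
```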