PMassicotte / gtrendsR

R functions to perform and display Google Trends queries

res$status_code == 200 is not True (getting a 401 status on URL) #252

Closed Horizon-cmchugh closed 6 years ago

Horizon-cmchugh commented 6 years ago

Hi there,

On making a simple query, I just recently started getting the error:

res$status_code == 200 is not TRUE

My other team members are as well. I debugged into the code to see what the issue is and the URL appears to be returning a 401 error, which is the root of the issue.

Is this happening on everyone's machine? Trying to figure out what could be causing this. Thanks.

Note, I did switch to the github development version from the CRAN version. If I call this: gtrends(keyword = "Superbowl", geo = "US", time = "2017-12-10 2018-01-20")

It ends up generating the following URL before erroring with the above error:

https://www.google.com/trends/api/widgetdata/relatedsearches/csv?req={\"restriction\":{\"geo\":{\"country\":\"US\"},\"time\":\"2017-12-01 2018-01-20\",\"complexKeywordsRestriction\":{\"keyword\":[{\"type\":\"BROAD\",\"value\":\"Superbowl\"}]}},\"keywordType\":\"ENTITY\",\"metric\":[\"TOP\",\"RISING\"],\"trendinessSettings\":{\"compareTime\":\"2017-10-11 2017-11-30\"},\"requestOptions\":{\"property\":\"\",\"backend\":\"IZG\",\"category\":0},\"language\":\"en\"}&token=APP6_UEAAAAAWmoNnGawUuoxeD_ROH3aVswwlfI1i9Rv&tz=300&hl=en-US

It's specifically the related topics portion that is failing.

PMassicotte commented 6 years ago

Try the dev version

plot(gtrendsR::gtrends(keyword = "Superbowl", geo = "US", time = "2017-12-10 2018-01-20"))

Horizon-cmchugh commented 6 years ago

Hmm, odd that it works for you, but not me. I am using the developer version:

> sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] gtrendsR_1.4.1      stringr_1.2.0       nlme_3.1-131        Kmisc_0.5.0         data.table_1.10.4-3 retimes_0.1-2       dplyr_0.7.4        
 [8] NCmisc_1.1.5        openxlsx_4.0.17     lubridate_1.7.1     chron_2.3-51        reshape2_1.4.2      plyr_1.8.4          RODBC_1.3-15       

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.14       compiler_3.4.2     git2r_0.19.0       bindr_0.1          proftools_0.99-2   digest_0.6.12      jsonlite_1.5      
 [8] anytime_0.3.0      gtable_0.2.0       memoise_1.1.0      tibble_1.3.4       lattice_0.20-35    pkgconfig_2.0.1    rlang_0.1.4       
[15] curl_3.1           bindrcpp_0.2       withr_2.1.0        httr_1.3.1         knitr_1.17         devtools_1.13.4    grid_3.4.2        
[22] glue_1.2.0         R6_2.2.2           ggplot2_2.2.1      magrittr_1.5       scales_0.5.0       assertthat_0.2.0   RApiDatetime_0.0.3
[29] colorspace_1.3-2   stringi_1.1.6      lazyeval_0.2.1     munsell_0.4.3      markdown_0.8    

What's interesting is that it's the related_topics function that is failing for me. interest_over_time and interest_by_region return as normal. Specifically this line in the gtrends function:

  related_topics <- related_topics(widget, comparison_item, 
    hl)

Which later calls create_related_topics_payload, with the eventual error caused by this:

Browse[5]> res
$url
[1] "https://trends.google.com/trends/api/widgetdata/relatedsearches/csv?req=%7B%22restriction%22:%7B%22geo%22:%7B%22country%22:%22US%22%7D,%22time%22:%222017-12-10%202018-01-20%22,%22complexKeywordsRestriction%22:%7B%22keyword%22:[%7B%22type%22:%22BROAD%22,%22value%22:%22Superbowl%22%7D]%7D%7D,%22keywordType%22:%22ENTITY%22,%22metric%22:[%22TOP%22,%22RISING%22],%22trendinessSettings%22:%7B%22compareTime%22:%222017-10-29%202017-12-09%22%7D,%22requestOptions%22:%7B%22property%22:%22%22,%22backend%22:%22IZG%22,%22category%22:0%7D,%22language%22:%22en%22%7D&token=APP6_UEAAAAAWmorduPXVYfuPRMiqtcXRHcGfWcoYmeh&tz=300&hl=en-US"

$status_code
[1] 401

It later fails because of this: stopifnot(res$status_code == 200)

I'll try this on a network outside of my current location, possible that there's some kind of firewall or something causing the url call to fail. Thanks for running.
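The bare stopifnot() is why the error message does not mention the 401 or the endpoint involved. A minimal sketch of a more descriptive check follows. check_status() is a hypothetical helper, not part of gtrendsR, and it assumes only that the response object exposes $status_code and $url as in the debugger output above.

```r
# Hypothetical helper (not part of gtrendsR): fail with a readable message
# instead of a bare stopifnot() when a response is not HTTP 200.
check_status <- function(res, step = "request") {
  # `res` is assumed to be a list-like object with $status_code and $url.
  if (!identical(as.integer(res$status_code), 200L)) {
    stop(sprintf("%s failed: HTTP %s returned by %s",
                 step, res$status_code, sub("\\?.*$", "", res$url)),
         call. = FALSE)
  }
  invisible(res)
}

# Stand-in response mimicking the failing call; no network access is involved.
fake_res <- list(
  url = "https://trends.google.com/trends/api/widgetdata/relatedsearches/csv?req=...",
  status_code = 401L
)
msg <- tryCatch(check_status(fake_res, "related topics"),
                error = function(e) conditionMessage(e))
print(msg)
```

The query string is stripped from the reported URL so that the message stays short and the token is not echoed into logs.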

PMassicotte commented 6 years ago

Ah sorry install the following branch. https://github.com/PMassicotte/gtrendsR/tree/low-search-volume

JoeOD commented 6 years ago

I installed from the suggested branch using the devtools::install_github("PMassicotte/gtrendsR") command, and I'm having the same issue as Horizon

Horizon-cmchugh commented 6 years ago

Hmm, I installed the branch with this line:

> devtools::install_github("PMassicotte/gtrendsR@low-search-volume")

But still getting the same error. Let me know if this is the incorrect way to install the branch.

Git is also saying that branch is identical to the master development version.
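To confirm which branch actually ended up installed, the package's DESCRIPTION can be inspected. This is a sketch under one assumption: that devtools/remotes records RemoteRef and RemoteSha fields for GitHub installs, which may vary between versions. installed_source() is a hypothetical helper name.

```r
# Hypothetical helper: report which version, branch and commit an installed
# package came from. The Remote* fields are assumed to be present only for
# packages installed from GitHub via devtools/remotes.
installed_source <- function(pkg) {
  desc <- suppressWarnings(utils::packageDescription(pkg))
  if (!is.list(desc)) stop("package not installed: ", pkg, call. = FALSE)
  field <- function(name) if (is.null(desc[[name]])) NA_character_ else desc[[name]]
  list(version = field("Version"), ref = field("RemoteRef"), sha = field("RemoteSha"))
}

str(installed_source("stats"))     # a base package, so no GitHub fields are recorded
# installed_source("gtrendsR")     # run locally to see the branch actually installed
```

Comparing the reported sha with the branch head on GitHub shows whether a reinstall picked up a different commit.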

PMassicotte commented 6 years ago

Humm. I do not know what's going on. I will try to look at it.

Horizon-cmchugh commented 6 years ago

Appreciate it, will check again on my home setup to see if I get the same issue. Odd that it works on yours but not ours, but makes me feel less crazy that Joe is having the same issue.

chicofish commented 6 years ago

Same. Old code no longer works.

trendterms2 <- gtrends(c("cancer survivorship", "surviving cancer", "beating cancer", "living with cancer"))

I hope this helps:

sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] xmltools_1.0 searchConsoleR_0.3.0 googleAnalyticsR_0.4.2 googleAuthR_0.6.2
[5] ggplot2_2.2.1 dplyr_0.7.4 plyr_1.8.4 gtrendsR_1.9.9.0
[9] urltools_1.6.0 rvest_0.3.2 stringr_1.2.0 xml2_1.1.1
[13] XML_3.98-1.9 httr_1.3.1 jsonlite_1.5

loaded via a namespace (and not attached):
[1] Rcpp_0.12.14 compiler_3.4.3 pillar_1.0.1 bindr_0.1
[5] tools_3.4.3 digest_0.6.12 anytime_0.3.0 memoise_1.1.0
[9] tibble_1.4.1 gtable_0.2.0 pkgconfig_2.0.1 rlang_0.1.6
[13] rstudioapi_0.7 curl_3.1 bindrcpp_0.2 triebeard_0.3.0
[17] grid_3.4.3 data.table_1.10.4-3 glue_1.2.0 R6_2.2.2
[21] selectr_0.3-1 tidyr_0.7.2 purrr_0.2.4 magrittr_1.5
[25] scales_0.4.1 assertthat_0.2.0 RApiDatetime_0.0.3 colorspace_1.3-2
[29] labeling_0.3 stringi_1.1.6 lazyeval_0.2.1 munsell_0.4.3

Christianmontes commented 6 years ago

I am also having the same problem; it started yesterday. I changed my IP since I thought it was a quota limit problem, but it is not. I traced the gtrends function and temporarily removed the "related_topics" and "related_queries" functions. This allowed the rest of the function to work again, so I can confirm what Horizon says: the problem seems to lie in these two functions.
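The same effect can be sketched without editing the function body, by treating the failing steps as optional. This is an illustration only: optional_step() and the two stand-in steps are hypothetical names, and gtrendsR itself is not confirmed to work this way.

```r
# Hypothetical wrapper: run an optional step and return NULL (with a warning)
# if it fails, so that one broken endpoint does not abort the whole query.
optional_step <- function(step, label) {
  tryCatch(step(), error = function(e) {
    warning(sprintf("%s skipped: %s", label, conditionMessage(e)), call. = FALSE)
    NULL
  })
}

# Stand-ins for the real steps; neither contacts Google.
working_step <- function() data.frame(date = "2018-01-01", hits = 42)
failing_step <- function() stop("res$status_code == 200 is not TRUE")

result <- list(
  interest_over_time = optional_step(working_step, "interest over time"),
  related_topics = suppressWarnings(optional_step(failing_step, "related topics"))
)
str(result)
```

The time series is still returned, and the related topics entry is simply NULL until the endpoint works again.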

drbanderson commented 6 years ago

I'm also having the same problem. My colleague and I did notice a change in the data we were getting from the query on Monday. On Monday the package was returning data different from what we could pull directly from the Google Trends website. After investigating, we found that searching for a phrase (e.g., How to tie a tie) using the package now required the phrase to be enclosed in quotation marks to return the same set of results as the website returns when the phrase is also enclosed in quotes there. Now we are seeing the same error as the rest:

Error: res$status_code == 200 is not TRUE

Perhaps Google has made a change to Trends?

diplodata commented 6 years ago

Could this be related to comparedgeo argument? I note that single country call:

https://trends.google.com/trends/api/widgetdata/comparedgeo?hl=en-GB&tz=0&req={"geo":{"country":"US"},"comparisonItem":[{"time":"2017-01-25+2018-01-25","complexKeywordsRestriction":{"keyword":[{"type":"BROAD","value":"trump"}]}}],"resolution":"REGION","locale":"en-GB","requestOptions":{"property":"","backend":"IZG","category":0}}&token=APP6_UEAAAAAWmsmNu_G4FYQMy0gpmwHjGF4PIkLPydA

Fails with 400 if you try and query 2 countries:

https://trends.google.com/trends/api/widgetdata/comparedgeo?hl=en-GB&tz=0&req={"geo":{"country":["US","GB"]},"comparisonItem":[{"time":"2017-01-25+2018-01-25","complexKeywordsRestriction":{"keyword":[{"type":"BROAD","value":"trump"}]}}],"resolution":"REGION","locale":"en-GB","requestOptions":{"property":"","backend":"IZG","category":0}}&token=APP6_UEAAAAAWmsmNu_G4FYQMy0gpmwHjGF4PIkLPydA

cspenn commented 6 years ago

+1 on this issue; receiving these errors more frequently lately as well. Started up probably a week ago.

PMassicotte commented 6 years ago

The trend data works fine. The error is from create_related_topics_payload(). Google must have changed something lately. I do not have time to investigate for the next few days. So if anyone can do it...

diplodata commented 6 years ago

I'm working on this. Some reformatting of url params in create_related_queries_payload required. But final problem seems to be tokens. It would be useful to have an idea how the token system works. Any steer @PMassicotte?

phnparker commented 6 years ago

Same issue.

Thanks to any/all who have the time to figure it out.

unitroot commented 6 years ago

Forked the corp proxy fork, to get a hotfix for the time series at least. Not a big thing, but it allows me to stay afloat until @PMassicotte has time for a decent fix. unitroot/gtrendsR

Christianmontes commented 6 years ago

@unitroot, how can we install your version? I currently use the trace function and remove the "related_topics" and "related_queries" functions. I am relatively new to R (coming from MATLAB), so I would love to have a hotfix that allows me to download the time series while the package is being fixed.

diplodata commented 6 years ago

I've also made a temp patch removing related topics/queries - devtools::install_github('diplodata/gtrendsR')

Chyvan commented 6 years ago

I have the same issue since Wednesday. At first, I thought I reached the quota limit but since it still doesn't work I think it must be something different. In the browser, I can easily access the Google Trends data but not with R. Edit: works again with the version from @diplodata , thx!

ghost commented 6 years ago

Same issue as described above. Is there a way to still download Google Trends data from R? Do we have to remove the functions quoted above by hand? Thank you for your help.

Chyvan commented 6 years ago

@ThomasAmetNike Have you tried using the version from @diplodata ?

ghost commented 6 years ago

@Chyvan Yes I did and it works fine ! Is it a hotfix or a final one ? @diplodata Thanks for the fix

cspenn commented 6 years ago

For the temp patch - any insight on the repair to the related trends/topics portion? Thank you!

diplodata commented 6 years ago

Sorry @cspenn, I literally just commented out 4 lines and rebuilt the library.

I might add however that the API URLs that the library calls (e.g. https://www.google.com/trends/api/widgetdata/relatedsearches/csv?...) appear somewhat different to those that the current Google Trends webpage itself calls to the server (https://trends.google.com/trends/api/widgetdata/relatedsearches?...) - which should probably be addressed.

Christianmontes commented 6 years ago

@PMassicotte, I know it's not best practice to write in a closed issue, and I assume, since you closed it, that you are working on a fix. Can we get any info as to when you think you will have a fix for the related terms/related topics part? Thanks a lot!

PMassicotte commented 6 years ago

@Christianmontes I do not know why this was closed.

diplodata commented 6 years ago

@Christianmontes I had a crack at fixing this earlier - see https://github.com/diplodata/gtrendsR/tree/debugging - but hit a wall. I fixed (I think) related_queries.R and related_topics.R so the API calls match the website. But they still fail because widget is not returning the right tokens. It definitely fails when you pass multiple geo arguments. But for what it's worth you might find what I've done useful.

projectflutrend commented 6 years ago

Yep, same issue here. The fix from diplodata helped a bit (thank you!), but after about 50 queries I get the "Error: widget$status_code == 200 is not TRUE" again. And I can't retrieve related queries.

diplodata commented 6 years ago

@projectflutrend Maybe server is identifying non-browsers as the related calls aren't being received? I suggest increasing time between calls, which should be at least 1s in any case.
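A minimal sketch of spacing out calls and retrying failures. fetch_throttled() is a hypothetical helper, `fetch` stands in for something like function(k) gtrendsR::gtrends(k), and the 1-second default only follows the suggestion above; it is not a documented Google limit.

```r
# Hypothetical helper: call `fetch` once per keyword, pausing between calls and
# retrying failures with a growing delay.
fetch_throttled <- function(keywords, fetch, pause = 1, max_tries = 3) {
  results <- vector("list", length(keywords))
  names(results) <- keywords
  for (k in keywords) {
    for (attempt in seq_len(max_tries)) {
      out <- tryCatch(fetch(k), error = function(e) e)
      if (!inherits(out, "error")) {
        results[[k]] <- out
        break
      }
      Sys.sleep(pause * 2^(attempt - 1))  # back off: pause, 2*pause, 4*pause, ...
    }
    Sys.sleep(pause)                      # spacing between keywords
  }
  results
}

# Stand-in fetcher that fails on its first call only; no network access.
calls <- 0
flaky <- function(k) {
  calls <<- calls + 1
  if (calls == 1) stop("widget$status_code == 200 is not TRUE")
  toupper(k)
}
res <- fetch_throttled(c("xbox", "surface"), flaky, pause = 0)
str(res)
```

Keywords whose attempts all fail are left as NULL in the result, so a long batch keeps going and the gaps can be retried later.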

Christianmontes commented 6 years ago

@diplodata I had a look at your debug. But like you I am not really familiar with Tokens, so I have no idea of how to fix this. @PMassicotte do you have any idea of what changes google made to their API?

diplodata commented 6 years ago

@Christianmontes Sorry I'm in the dark as much as you. I suspect the challenge is to get get_widget() to match what the browser pings to the server at the start of a new query.

PMassicotte commented 6 years ago

@Christianmontes I am traveling for the next 12 days. I will then have time to work on this.

Christianmontes commented 6 years ago

@PMassicotte, Hi Phillip, no problem. We really appreciate the help!!

bharath15081987 commented 6 years ago

Still am getting the same error

if (!require("devtools")) install.packages("devtools")
devtools::install_github("PMassicotte/gtrendsR")
library(gtrendsR)
check <- gtrendsR::gtrends("Xbox",geo = "US")
# Error: res$status_code == 200 is not TRUE

Can someone help me on this please

phnparker commented 6 years ago

@bharath15081987 That's because the issue hasn't been resolved yet. Read this thread again more closely, that is all discussed already, including a temporary work-around.

bharath15081987 commented 6 years ago

Thanks, @phnparker . somewhere I saw temporary workaround so tried to run from developer version. Will wait for the fix!!!!!!

phnparker commented 6 years ago

@bharath15081987 If you don't need the related payload, try @diplodata version. devtools::install_github('diplodata/gtrendsR')

bharath15081987 commented 6 years ago

No luck, it's failing. Thanks for your help.

diplodata commented 6 years ago

Just re-tested - works for me.

phnparker commented 6 years ago

Same. Works for me.

bharath15081987 commented 6 years ago

Used the below script but still getting an error. Don't know what's wrong with me

devtools::install_github('diplodata/gtrendsR')
library(gtrendsR)
library(dbConnect)
library(RODBC)

# search term and country name
current.date <- as.character(Sys.Date())
res <- gtrends(c("Surface Pro 4"), geo = c("US"), time = paste("2015-10-01", current.date, sep = " " ))
Error: widget$status_code == 200 is not TRUE

phnparker commented 6 years ago

Your code works for me. However, your error is not the error that this thread concerns.

Your error: Error: widget$status_code == 200 is not TRUE
This thread's error: Error: res$status_code == 200 is not TRUE

Though they may be related issues stemming from the same changes that Google made to the API.


bharath15081987 commented 6 years ago

Ya, I have noticed the error, it's the widget status. Let me try with a different IP. Thanks @phnparker

diplodata commented 6 years ago

Do you get any "The following object(s) are masked from ..." messages when you load the packages up? If so, try loading gtrendsR last.
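One way to check for masking is to ask R which attached environments define a given function name. A small sketch using utils::find(); who_defines() is a hypothetical wrapper name, and whether any of the packages above actually masks gtrends is not established in this thread.

```r
# List every attached environment that defines a function with this name, in
# search-path order; the first entry is the one an unqualified call would use.
who_defines <- function(name) utils::find(name, mode = "function")

who_defines("filter")     # in a default session this includes "package:stats"
# who_defines("gtrends")  # run locally with your own packages attached
```

Calling the function with its namespace, as in gtrendsR::gtrends(...), avoids the question of load order altogether.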

janush1985 commented 6 years ago

hey all,

Had same issue.

Temp fix: devtools::install_github('diplodata/gtrendsR') seems to be working ok.

Thanks, Janusz

cspenn commented 6 years ago

It looks like the folks at PyTrends managed to do the token thing. I've no clue how to transpose python to R, though. Here's the relevant code:

https://github.com/GeneralMills/pytrends/blob/master/pytrends/request.py

diplodata commented 6 years ago

@PMassicotte must know how it works, and should be back in a week or so.

fool65c commented 6 years ago

I looked at the python lib and it looks like this is missing the "originalTimeRangeForExploreUrl" restriction.

https://github.com/kevinmager65/gtrendsR/commit/b4b95775f3622ff81fce995a12f7a6ca5784f918

PMassicotte commented 6 years ago

I should be able to look at it Friday. Stay in touch.

ucb commented 6 years ago

I am using devtools::install_github('diplodata/gtrendsR'), which generally works, but both res <- gtrends("", category = "1267", geo = "NO", time = "all") and res <- gtrends(category = "1267", geo = "NO", time = "all") result in an error. Is it a side effect of diplodata cutting off some things? or is it some other problem?