PMassicotte / gtrendsR

R functions to perform and display Google Trends queries

Proxy problems #434

Closed paulcbauer closed 1 year ago

paulcbauer commented 1 year ago

Hi, we are using gtrendsR for a research project and would like to run a few thousand queries within a few hours, as the analysis is very fine-grained. To make this work, we want to use residential proxies from a proxy service (the provider has whitelisted us for research purposes).

Setting the proxy in R via Sys.setenv works well (it rotates through IPs), but when running gtrendsR we get an HTTP error code 407:

“Error in curl::curl_fetch_memory(url, handle = .pkgenv[["cookie_handler"]]) : Received HTTP code 407 from proxy after CONNECT”

We also tried to insert the proxy parameters into the setHandleParameters() function, which didn't work either.

Here's our code:

library(jsonlite)
library(gtrendsR)

# Set rotating proxy (both URLs must be quoted strings)
Sys.setenv(http_proxy = "http://user-username:npassword@domain:port/",
           https_proxy = "http://user-username:npassword@domain:port/")

# Test if it works: IP should change every time fromJSON() is called
fromJSON("https://api.myip.com/") # works because it returns a new IP on every call

# Does not work, returns HTTP error code 407
gtrends(keyword = "Merkel",
        geo = "DE",
        category = 19,
        time = "2020-09-10 2020-09-17",
        gprop = "web",
        onlyInterest = TRUE)$interest_over_time

# The above does not work either when we set the parameters in the following function
setHandleParameters(user = "XXXXXXXXXX",
                     password = "XXXXXXXXXX",
                     domain = "XXXXXXXX",
                     proxyhost = "XXXXX")

Any help would be greatly appreciated. Thanks!

tom-parkinson commented 8 months ago

Hi @paulcbauer, I was wondering whether you got this to work? I'm in exactly the same position with a research project and was hoping to solve the 407 issue.

paulcbauer commented 8 months ago

Hi @tom-parkinson, in the end we didn't get it to work and instead spread the calls out over time. Not ideal, though. You could try different commercial proxies (e.g., smartproxy, rayobyte, etc.).
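
For anyone landing here with the same problem, spreading the calls out over time could look roughly like the sketch below. It is only an illustration, not the code used in the project above: run_throttled() and fetch_trend() are hypothetical helper names, the delay and jitter values are arbitrary assumptions rather than known Google Trends limits, and the second keyword is made up for the example.

```r
# Hypothetical helper: call `fetch` on each query, pausing between calls.
# delay_sec and jitter_sec are arbitrary assumptions, not documented limits.
run_throttled <- function(queries, fetch, delay_sec = 60, jitter_sec = 10) {
  results <- vector("list", length(queries))
  for (i in seq_along(queries)) {
    res <- tryCatch(
      fetch(queries[[i]]),
      error = function(e) {
        # Report the failure and keep going with the remaining queries
        message("Query ", i, " failed: ", conditionMessage(e))
        NULL
      }
    )
    # Assign via list() so that a NULL result keeps its slot
    results[i] <- list(res)
    # Pause (with random jitter) before the next call, but not after the last
    if (i < length(queries)) {
      Sys.sleep(delay_sec + runif(1, 0, jitter_sec))
    }
  }
  results
}

# Hypothetical wrapper around the gtrends() call from the original post
fetch_trend <- function(kw) {
  gtrendsR::gtrends(keyword = kw,
                    geo = "DE",
                    category = 19,
                    time = "2020-09-10 2020-09-17",
                    gprop = "web",
                    onlyInterest = TRUE)$interest_over_time
}

# Example call (commented out because it needs network access):
# out <- run_throttled(c("Merkel", "Scholz"), fetch_trend)
```

Failed queries come back as NULL in their original position, so they can be retried later without re-running the ones that succeeded.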