r-lib / httr

httr: a friendly http package for R
https://httr.r-lib.org

httr & use_proxy vs RCurl behaviour difference #737

Closed yogesh-bansal closed 1 year ago

yogesh-bansal commented 1 year ago

I am trying to use a proxy server via the use_proxy function in a GET call. The proxy server is set up to rotate IPs with each call, but the IP sticks when I use GET with use_proxy, while it rotates correctly with RCurl and with system curl commands. I cannot make sense of why the two behave differently.

> ## usname <- "username"
> ## uspwd <- "password"
> 
> ## Ip Sticking
> library(httr)
> library(jsonlite)
> for(idx in 1:3)
+ {
+   message(fromJSON(content(GET("https://ipinfo.io",
+       use_proxy(
+           url = "gate.smartproxy.com",
+           port = 7000,
+           username = usname,
+           password = uspwd
+       )),"text"))$ip)
+ }
120.188.79.85
120.188.79.85
120.188.79.85
> 
> 
> 
> ## Ip Rotating
> library(RCurl)
> opts <- list(
+   proxy         = "gate.smartproxy.com",
+   proxyusername = usname, 
+   proxypassword = uspwd, 
+   proxyport     = 7000
+ )
> options(RCurlOptions = opts)
> for(idx in 1:3) message(fromJSON(getURL("https://ipinfo.io"))$ip)
42.118.70.31
106.194.152.66
180.183.135.201

yogesh-bansal commented 1 year ago

The solution was to pass config(forbid_reuse = TRUE) to the GET call. This prevents httr from reusing connections, so each call gets a new IP. The code should have been:

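library(httr)
library(jsonlite)

for (idx in 1:3) {
  # forbid_reuse = TRUE stops httr from reusing the connection,
  # so each request goes through the proxy on a new connection
  resp <- GET(
    "https://ipinfo.io",
    use_proxy(
      url = "gate.smartproxy.com",
      port = 7000,
      username = usname,
      password = uspwd
    ),
    config(forbid_reuse = TRUE)
  )
  message(fromJSON(content(resp, "text"))$ip)
}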
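
If the same proxy settings are needed in several calls, the fix can be wrapped in a small helper. This is only a minimal sketch: rotating_proxy_config is a hypothetical name (not part of httr), and it assumes that combining two httr config objects with c() merges their options.

```r
library(httr)

# Hypothetical helper (not part of httr): bundles the proxy settings with
# forbid_reuse = TRUE so the connection is not kept for reuse after a request.
rotating_proxy_config <- function(url, port, username, password) {
  c(
    use_proxy(url = url, port = port, username = username, password = password),
    config(forbid_reuse = TRUE)
  )
}

# Usage (needs network access and valid credentials in usname / uspwd):
# cfg <- rotating_proxy_config("gate.smartproxy.com", 7000, usname, uspwd)
# for (idx in 1:3) {
#   message(jsonlite::fromJSON(content(GET("https://ipinfo.io", cfg), "text"))$ip)
# }
```

Passing the combined config to each GET call keeps the proxy details in one place, so forbid_reuse cannot be forgotten on an individual request.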