First of all, impressive work here; good job. I use the httr and curl packages to pull data from Smartsheet. My employer routes traffic through a proxy server, so I have to build a proxy configuration with use_proxy() from the httr package and feed it into my GET() calls. The issue is that I now want to crawl a web page with various links, and I cannot find a way to pass that proxy configuration into the Rcrawler() function. I tried, without success, calls like 'Rcrawler(Website = paste("www.website.com", config_proxy), no_cores = 4, no_conn = 4)', where config_proxy is the object returned by use_proxy(). Is there a specific way to pass proxy information to Rcrawler()? I also tried your example with several variations on where to place the config_proxy variable, but had no success.
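For reference, here is a minimal sketch of my setup; the proxy host, port, and URLs below are placeholders, not my real values, and the Smartsheet auth headers are omitted:

```r
library(httr)
library(Rcrawler)

# Proxy configuration that works for single requests.
# Host and port are placeholders for my employer's proxy.
config_proxy <- use_proxy(url = "http://proxy.example.com", port = 8080)

# This pattern works fine for pulling Smartsheet data:
resp <- GET("https://api.smartsheet.com/2.0/sheets", config_proxy)

# Attempted Rcrawler call that did NOT work -- pasting the proxy object
# into the URL string clearly isn't the right mechanism:
# Rcrawler(Website = paste("www.website.com", config_proxy),
#          no_cores = 4, no_conn = 4)
```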