jeroen / curl

A Modern and Flexible Web Client for R
https://jeroen.r-universe.dev/curl

Error in curl::curl_fetch_disk(url, x$path, handle = handle) : Timeout was reached #72

Closed luiandresgonzalez closed 8 years ago

luiandresgonzalez commented 8 years ago

I'm on Ubuntu 16.04, running R:

platform       x86_64-pc-linux-gnu         
arch           x86_64                      
os             linux-gnu                   
system         x86_64, linux-gnu           
status                                     
major          3                           
minor          2.3                         
year           2015                        
month          12                          
day            10                          
svn rev        69752                       
language       R                           
version.string R version 3.2.3 (2015-12-10)
nickname       Wooden Christmas-Tree 

All packages and dependencies are up to date (including devtools, RCurl, curl, and httr). When I try to use install_github to install any package, I get the following:

Error in curl::curl_fetch_memory(url, handle = handle) : 
  Timeout was reached

traceback()
12: .Call(R_curl_fetch_disk, url, handle, path, "wb", nonblocking)
11: curl::curl_fetch_disk(url, x$path, handle = handle)
10: request_fetch.write_disk(req$output, req$url, handle)
9: request_fetch(req$output, req$url, handle)
8: request_perform(req, hu$handle$handle)
7: httr::GET(url, path = path, httr::write_disk(path = tmp))
6: remote_package_name.github_remote(remote)
5: remote_package_name(remote)
4: FUN(X[[i]], ...)
3: vapply(remotes, install_remote, ..., FUN.VALUE = logical(1))
2: install_remotes(remotes, quiet = quiet, ...)
1: install_github("StatsWithR/statsr")

sainathadapa commented 8 years ago

I am facing the same issue as well. Using Ubuntu 16.04.

> library(devtools)
> library(rvest)
Loading required package: xml2
> read_html('https://github.com/hadley/devtools/issues/877')
Error in open.connection(x, "rb") : Timeout was reached
> session_info()
Session info ---------------------------------------------------------------------------------------------------
 setting  value                       
 version  R version 3.3.1 (2016-06-21)
 system   x86_64, linux-gnu           
 ui       RStudio (99.9.9)            
 language en_IN:en                    
 collate  en_US.UTF-8                 
 tz       Asia/Calcutta               
 date     2016-08-13                  

Packages -------------------------------------------------------------------------------------------------------
 package  * version date       source        
 curl       1.1     2016-07-26 CRAN (R 3.3.1)
 devtools * 1.12.0  2016-06-24 CRAN (R 3.3.1)
 digest     0.6.10  2016-08-02 CRAN (R 3.3.1)
 httr       1.2.1   2016-07-03 CRAN (R 3.3.1)
 magrittr   1.5     2014-11-22 CRAN (R 3.3.0)
 memoise    1.0.0   2016-01-29 CRAN (R 3.3.0)
 R6         2.1.2   2016-01-26 CRAN (R 3.3.0)
 Rcpp       0.12.6  2016-07-19 CRAN (R 3.3.1)
 rvest    * 0.3.2   2016-06-17 CRAN (R 3.3.1)
 withr      1.0.2   2016-06-20 CRAN (R 3.3.0)
 xml2     * 1.0.0   2016-06-24 CRAN (R 3.3.1)
> curl_version()
$version
[1] "7.47.0"

$ssl_version
[1] "OpenSSL/1.0.2g"

$libz_version
[1] "1.2.8"

$libssh_version
[1] NA

$libidn_version
[1] "1.32"

$host
[1] "x86_64-pc-linux-gnu"

$protocols
 [1] "dict"   "file"   "ftp"    "ftps"   "gopher" "http"   "https"  "imap"   "imaps"  "ldap"   "ldaps"  "pop3"  
[13] "pop3s"  "rtmp"   "rtsp"   "smb"    "smbs"   "smtp"   "smtps"  "telnet" "tftp"  

$ipv6
[1] TRUE

$http2
[1] FALSE
sainathadapa commented 8 years ago

@luiandresgonzalez Were you able to fix the issue, or come up with a workaround?

jeroen commented 8 years ago

I just tried this on an Ubuntu 16.04 machine and cannot reproduce any problem. Are you sure there wasn't some temporary network problem on your machine that caused the request to fail?

Note that requests fail on a fairly regular basis on wifi and the like, but in your browser you might not notice because it silently retries.
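For what it's worth, that browser-style behavior can be approximated with a small retry wrapper. This is just an illustrative sketch; the retry_fetch helper is hypothetical, not part of the curl package:

```r
library(curl)

# Hypothetical helper: retry a fetch a few times before giving up,
# the way a browser silently retries transient network failures.
retry_fetch <- function(url, times = 3) {
  for (i in seq_len(times)) {
    res <- tryCatch(curl_fetch_memory(url), error = identity)
    if (!inherits(res, "error")) return(res)
    message("Attempt ", i, " failed: ", conditionMessage(res))
  }
  stop(res)
}

req <- retry_fetch("https://httpbin.org/get")
req$status_code
```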

sainathadapa commented 8 years ago

I have been unable to use any function that uses curl for at least a week. I don't think it is a connection issue: I have a desktop and the connection is via ethernet. I also tried a tethered internet connection via mobile, and that hasn't worked either. Maybe something I installed, or some system update, is causing this.

Can you give me some pointers on how to debug this problem, or link to a few resources that can help solve it? Thanks.

jeroen commented 8 years ago

Can you show me output of:

curl_fetch_memory("https://httpbin.org/get", new_handle(verbose = TRUE))
sainathadapa commented 8 years ago
> library(curl)
> curl_fetch_memory("https://httpbin.org/get", new_handle(verbose = TRUE))
* Resolving timed out after 10000 milliseconds
* Closing connection 0
Error in curl_fetch_memory("https://httpbin.org/get", new_handle(verbose = TRUE)) : 
  Timeout was reached
jeroen commented 8 years ago

Wow. Can you try:

curl::nslookup("httpbin.org")
utils::nsl("httpbin.org")
sainathadapa commented 8 years ago
> curl::nslookup("httpbin.org")
[1] "54.175.219.8"
> utils::nsl("httpbin.org")
[1] "23.22.14.18"

curl::nslookup took some time to return a result; the utils::nsl response was instant.
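A quick way to quantify that difference is to time the two lookups; a sketch (both calls hit the network, so exact timings will vary):

```r
# Compare wall-clock time of the two resolvers used above.
system.time(curl::nslookup("httpbin.org"))  # lookup via the curl package
system.time(utils::nsl("httpbin.org"))      # lookup via base R's utils
```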

jeroen commented 8 years ago

And can you try in R:

system("ping -c5 httpbin.org")
sainathadapa commented 8 years ago
> system("ping -c5 httpbin.org")
PING httpbin.org (54.175.219.8) 56(84) bytes of data.
64 bytes from ec2-54-175-219-8.compute-1.amazonaws.com (54.175.219.8): icmp_seq=1 ttl=42 time=233 ms
64 bytes from ec2-54-175-219-8.compute-1.amazonaws.com (54.175.219.8): icmp_seq=2 ttl=42 time=232 ms
64 bytes from ec2-54-175-219-8.compute-1.amazonaws.com (54.175.219.8): icmp_seq=3 ttl=42 time=232 ms
64 bytes from ec2-54-175-219-8.compute-1.amazonaws.com (54.175.219.8): icmp_seq=4 ttl=42 time=231 ms
64 bytes from ec2-54-175-219-8.compute-1.amazonaws.com (54.175.219.8): icmp_seq=5 ttl=42 time=230 ms

--- httpbin.org ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4004ms
rtt min/avg/max/mdev = 230.975/232.256/233.394/0.923 ms
jeroen commented 8 years ago

And if you run (outside of R) in the terminal:

ping -c5 httpbin.org
sainathadapa commented 8 years ago
sainath@sai-desk ~> ping -c5 httpbin.org
PING httpbin.org (23.22.14.18) 56(84) bytes of data.
64 bytes from ec2-23-22-14-18.compute-1.amazonaws.com (23.22.14.18): icmp_seq=1 ttl=42 time=329 ms
64 bytes from ec2-23-22-14-18.compute-1.amazonaws.com (23.22.14.18): icmp_seq=2 ttl=42 time=254 ms
64 bytes from ec2-23-22-14-18.compute-1.amazonaws.com (23.22.14.18): icmp_seq=3 ttl=42 time=259 ms
64 bytes from ec2-23-22-14-18.compute-1.amazonaws.com (23.22.14.18): icmp_seq=4 ttl=42 time=251 ms
64 bytes from ec2-23-22-14-18.compute-1.amazonaws.com (23.22.14.18): icmp_seq=5 ttl=42 time=238 ms

--- httpbin.org ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4000ms
rtt min/avg/max/mdev = 238.385/266.605/329.599/32.256 ms
jeroen commented 8 years ago

Can you also try:

curl_fetch_memory("http://54.175.219.8/get", new_handle(verbose = TRUE))
curl_fetch_memory("http://23.22.14.18/get", new_handle(verbose = TRUE))
sainathadapa commented 8 years ago
> library(curl)
> curl_fetch_memory("http://54.175.219.8/get", new_handle(verbose = TRUE))
*   Trying 54.175.219.8...
* Connected to 54.175.219.8 (54.175.219.8) port 80 (#0)
> GET /get HTTP/1.1
Host: 54.175.219.8
User-Agent: r/curl/jeroen
Accept: */*
Accept-Encoding: gzip, deflate

< HTTP/1.1 200 OK
< Server: nginx
< Date: Sat, 13 Aug 2016 11:03:39 GMT
< Content-Type: application/json
< Content-Length: 233
< Connection: keep-alive
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Credentials: true
< 
* Connection #0 to host 54.175.219.8 left intact
$url
[1] "http://54.175.219.8/get"

$status_code
[1] 200

$headers
  [1] 48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f 4b 0d 0a 53 65 72 76 65 72 3a 20
 [26] 6e 67 69 6e 78 0d 0a 44 61 74 65 3a 20 53 61 74 2c 20 31 33 20 41 75 67 20
 [51] 32 30 31 36 20 31 31 3a 30 33 3a 33 39 20 47 4d 54 0d 0a 43 6f 6e 74 65 6e
 [76] 74 2d 54 79 70 65 3a 20 61 70 70 6c 69 63 61 74 69 6f 6e 2f 6a 73 6f 6e 0d
[101] 0a 43 6f 6e 74 65 6e 74 2d 4c 65 6e 67 74 68 3a 20 32 33 33 0d 0a 43 6f 6e
[126] 6e 65 63 74 69 6f 6e 3a 20 6b 65 65 70 2d 61 6c 69 76 65 0d 0a 41 63 63 65
[151] 73 73 2d 43 6f 6e 74 72 6f 6c 2d 41 6c 6c 6f 77 2d 4f 72 69 67 69 6e 3a 20
[176] 2a 0d 0a 41 63 63 65 73 73 2d 43 6f 6e 74 72 6f 6c 2d 41 6c 6c 6f 77 2d 43
[201] 72 65 64 65 6e 74 69 61 6c 73 3a 20 74 72 75 65 0d 0a 0d 0a

$modified
[1] NA

$times
     redirect    namelookup       connect   pretransfer starttransfer 
     0.000000      0.000036      0.221237      0.221310      0.448007 
        total 
     0.448161 

$content
  [1] 7b 0a 20 20 22 61 72 67 73 22 3a 20 7b 7d 2c 20 0a 20 20 22 68 65 61 64 65
 [26] 72 73 22 3a 20 7b 0a 20 20 20 20 22 41 63 63 65 70 74 22 3a 20 22 2a 2f 2a
 [51] 22 2c 20 0a 20 20 20 20 22 41 63 63 65 70 74 2d 45 6e 63 6f 64 69 6e 67 22
 [76] 3a 20 22 67 7a 69 70 2c 20 64 65 66 6c 61 74 65 22 2c 20 0a 20 20 20 20 22
[101] 48 6f 73 74 22 3a 20 22 35 34 2e 31 37 35 2e 32 31 39 2e 38 22 2c 20 0a 20
[126] 20 20 20 22 55 73 65 72 2d 41 67 65 6e 74 22 3a 20 22 72 2f 63 75 72 6c 2f
[151] 6a 65 72 6f 65 6e 22 0a 20 20 7d 2c 20 0a 20 20 22 6f 72 69 67 69 6e 22 3a
[176] 20 22 31 38 30 2e 31 35 31 2e 32 31 31 2e 31 33 36 22 2c 20 0a 20 20 22 75
[201] 72 6c 22 3a 20 22 68 74 74 70 3a 2f 2f 35 34 2e 31 37 35 2e 32 31 39 2e 38
[226] 2f 67 65 74 22 0a 7d 0a

> curl_fetch_memory("http://23.22.14.18/get", new_handle(verbose = TRUE))
*   Trying 23.22.14.18...
* Connected to 23.22.14.18 (23.22.14.18) port 80 (#0)
> GET /get HTTP/1.1
Host: 23.22.14.18
User-Agent: r/curl/jeroen
Accept: */*
Accept-Encoding: gzip, deflate

< HTTP/1.1 200 OK
< Server: nginx
< Date: Sat, 13 Aug 2016 11:03:42 GMT
< Content-Type: application/json
< Content-Length: 231
< Connection: keep-alive
< Access-Control-Allow-Origin: *
< Access-Control-Allow-Credentials: true
< 
* Connection #0 to host 23.22.14.18 left intact
$url
[1] "http://23.22.14.18/get"

$status_code
[1] 200

$headers
  [1] 48 54 54 50 2f 31 2e 31 20 32 30 30 20 4f 4b 0d 0a 53 65 72 76 65 72 3a 20
 [26] 6e 67 69 6e 78 0d 0a 44 61 74 65 3a 20 53 61 74 2c 20 31 33 20 41 75 67 20
 [51] 32 30 31 36 20 31 31 3a 30 33 3a 34 32 20 47 4d 54 0d 0a 43 6f 6e 74 65 6e
 [76] 74 2d 54 79 70 65 3a 20 61 70 70 6c 69 63 61 74 69 6f 6e 2f 6a 73 6f 6e 0d
[101] 0a 43 6f 6e 74 65 6e 74 2d 4c 65 6e 67 74 68 3a 20 32 33 31 0d 0a 43 6f 6e
[126] 6e 65 63 74 69 6f 6e 3a 20 6b 65 65 70 2d 61 6c 69 76 65 0d 0a 41 63 63 65
[151] 73 73 2d 43 6f 6e 74 72 6f 6c 2d 41 6c 6c 6f 77 2d 4f 72 69 67 69 6e 3a 20
[176] 2a 0d 0a 41 63 63 65 73 73 2d 43 6f 6e 74 72 6f 6c 2d 41 6c 6c 6f 77 2d 43
[201] 72 65 64 65 6e 74 69 61 6c 73 3a 20 74 72 75 65 0d 0a 0d 0a

$modified
[1] NA

$times
     redirect    namelookup       connect   pretransfer starttransfer 
     0.000000      0.000028      0.236157      0.236218      0.478174 
        total 
     0.478299 

$content
  [1] 7b 0a 20 20 22 61 72 67 73 22 3a 20 7b 7d 2c 20 0a 20 20 22 68 65 61 64 65
 [26] 72 73 22 3a 20 7b 0a 20 20 20 20 22 41 63 63 65 70 74 22 3a 20 22 2a 2f 2a
 [51] 22 2c 20 0a 20 20 20 20 22 41 63 63 65 70 74 2d 45 6e 63 6f 64 69 6e 67 22
 [76] 3a 20 22 67 7a 69 70 2c 20 64 65 66 6c 61 74 65 22 2c 20 0a 20 20 20 20 22
[101] 48 6f 73 74 22 3a 20 22 32 33 2e 32 32 2e 31 34 2e 31 38 22 2c 20 0a 20 20
[126] 20 20 22 55 73 65 72 2d 41 67 65 6e 74 22 3a 20 22 72 2f 63 75 72 6c 2f 6a
[151] 65 72 6f 65 6e 22 0a 20 20 7d 2c 20 0a 20 20 22 6f 72 69 67 69 6e 22 3a 20
[176] 22 31 38 30 2e 31 35 31 2e 32 31 31 2e 31 33 36 22 2c 20 0a 20 20 22 75 72
[201] 6c 22 3a 20 22 68 74 74 70 3a 2f 2f 32 33 2e 32 32 2e 31 34 2e 31 38 2f 67
[226] 65 74 22 0a 7d 0a
jeroen commented 8 years ago

Hmmm so there must be something wrong with your DNS settings. To confirm, the first one fails, but the second one succeeds?

req <- curl_fetch_memory("http://httpbin.org/get")
req <-  curl_fetch_memory(paste0("http://", nslookup("httpbin.org"), "/get"))
sainathadapa commented 8 years ago
> req <- curl_fetch_memory("http://httpbin.org/get")
Error in curl_fetch_memory("http://httpbin.org/get") : 
  Timeout was reached
> req <-  curl_fetch_memory(paste0("http://", nslookup("httpbin.org"), "/get"))
> 

So, should I use Google DNS?

jeroen commented 8 years ago

Does the same problem appear when you use R's downloaders?

readLines("https://httpbin.org/get")

Or when you use the curl command from the terminal:

curl https://httpbin.org/get
sainathadapa commented 8 years ago
> readLines("https://httpbin.org/get")
 [1] "{"                                                                     
 [2] "  \"args\": {}, "                                                      
 [3] "  \"headers\": {"                                                      
 [4] "    \"Accept\": \"*/*\", "                                             
 [5] "    \"Host\": \"httpbin.org\", "                                       
 [6] "    \"User-Agent\": \"R (3.3.1 x86_64-pc-linux-gnu x86_64 linux-gnu)\""
 [7] "  }, "                                                                 
 [8] "  \"origin\": \"180.151.211.136\", "                                   
 [9] "  \"url\": \"https://httpbin.org/get\""                                
[10] "}" 

I tried download.file as well, and it worked:

> download.file("https://httpbin.org/get", destfile = tempfile())
trying URL 'https://httpbin.org/get'
Content type 'application/json' length 224 bytes
==================================================
downloaded 224 bytes
jeroen commented 8 years ago

This is so strange. Can you try with the RCurl package:

RCurl::getURL("https://httpbin.org/get")
sainathadapa commented 8 years ago
> RCurl::getURL("https://httpbin.org/get")
[1] "{\n  \"args\": {}, \n  \"headers\": {\n    \"Accept\": \"*/*\", \n    \"Host\": \"httpbin.org\"\n  }, \n  \"origin\": \"180.151.211.136\", \n  \"url\": \"https://httpbin.org/get\"\n}\n"
> 
jeroen commented 8 years ago

Can you try if this makes a difference:

req <- curl_fetch_memory("https://httpbin.org/get")
options(curl_interrupt = FALSE)
req <- curl_fetch_memory("https://httpbin.org/get")
sainathadapa commented 8 years ago

Didn't work.

> req <- curl_fetch_memory("https://httpbin.org/get")
Error in curl_fetch_memory("https://httpbin.org/get") : 
  Timeout was reached
> options(curl_interrupt = FALSE)
> req <- curl_fetch_memory("https://httpbin.org/get")
Error in curl_fetch_memory("https://httpbin.org/get") : 
  Timeout was reached
> 
jeroen commented 8 years ago

OK, this is hard to debug. Could I get a shell or a guest account on your machine or so?

jeroen commented 8 years ago

It seems your DNS server is really slow. It takes (almost exactly) 15 seconds to resolve a name:

$times
     redirect    namelookup       connect   pretransfer starttransfer
      0.00000      15.00657      15.22732      15.70043      15.92480

It times out in curl because the default connect timeout has been set to 10 seconds. You can override the timeout by setting CONNECTTIMEOUT:

curl_fetch_memory("https://httpbin.org/get", handle = new_handle(CONNECTTIMEOUT = 60))

The same problem appears in the terminal so it is unrelated to R:

curl -w "time_namelookup:  %{time_namelookup}\n" https://httpbin.org/get

But the real question is of course why your DNS server is so slow. It looks like some misconfiguration where it first tries one DNS server, gives up after 15 seconds, and then falls back to another, because it takes exactly 15 seconds each time.
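For reference, the DNS share of a request can be read straight from the times field that curl_fetch_memory() returns; a sketch (note the lowercase connecttimeout spelling used by the R package's new_handle()):

```r
library(curl)

# Fetch with a generous connect timeout, then inspect where the time went.
req <- curl_fetch_memory("https://httpbin.org/get",
                         handle = new_handle(connecttimeout = 60))
req$times["namelookup"]  # seconds spent on DNS resolution alone
```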

sainathadapa commented 8 years ago

OK. If DNS were the issue, then using the mobile internet connection should have solved the problem, right? Maybe something in my system's DNS settings is the problem. I will search for ways to reset the DNS settings or something of that sort. Do you think changing the DNS to Google's (8.8.8.8) will work?

Is there an option for CONNECTTIMEOUT that I can put in .Rprofile, like options(CONNECTTIMEOUT = 60)?

Thanks for all the help.


jeroen commented 8 years ago

I'm not sure. If you have configured a dead DNS server in your system, it might time out on that for any connection. Maybe read some of this: http://unix.stackexchange.com/questions/141163/dns-lookups-sometimes-take-5-seconds
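Regarding the .Rprofile question above: as far as I know there is no global option() that the curl package consults for the connect timeout, but one workaround is to define a small wrapper there. The fetch_patient helper below is hypothetical, just a sketch:

```r
# Hypothetical .Rprofile helper: always fetch with a longer connect timeout.
fetch_patient <- function(url, timeout = 60) {
  curl::curl_fetch_memory(url,
                          handle = curl::new_handle(connecttimeout = timeout))
}
```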

jeroen commented 8 years ago

I think you have a dead DNS server configured inside /etc/resolvconf/resolv.conf.d. Can you comment out the file called original and reboot?

sainathadapa commented 8 years ago

I can't find the original file. Here is the directory structure:

sainath@sai-desk:/etc/resolvconf$ tree
.
├── interface-order
├── resolv.conf.d
│   ├── base
│   └── head
├── update.d
│   └── libc
└── update-libc.d
    └── avahi-daemon

3 directories, 5 files
jeroen commented 8 years ago

Oh sorry, maybe that was on my host machine; I must have gotten confused.

I recommend switching to google DNS or OpenDNS servers and see if that resolves the issue.

sainathadapa commented 8 years ago

Switching to Google DNS solved the issue. Thanks!

luiandresgonzalez commented 8 years ago

Hi, sorry for the late answer. I can confirm that the problem is related to a bad DNS server. Switching to 8.8.8.8 solved the problem.

tarsileshi commented 8 years ago

Could you help me? I got the following problem when installing with install_github('dutri001/bfastSpatial'):

Error in curl::curl_fetch_disk(url, x$path, handle = handle) : Timeout was reached


deepankarSrigyan commented 7 years ago

What I have realized is that this happens when your network doesn't allow you to download everything from the internet, mostly when you are on an organization's network. Try it over a connection that is not organization specific; it may work from home. For me, it did.

amirhmstu commented 6 years ago

How do I include an image in R Markdown?


kaybeekim commented 1 year ago

In my case this problem occurred when I had used up all the EBS storage on an AWS EC2 instance. After removing some files, the error was gone.