Closed jhpoelen closed 3 months ago
Thanks much. Haven't thought about this yet, will have a look and hopefully fix soon
woops, accidentally closed
Hmm, I can't reproduce the error using your example; perhaps there is some interaction with network connection speed as well? Still, I think you're correct that it's a bug. @sckott can you reproduce? (I just ran the wget
command from the Ubuntu-based DO box, which obviously has a good network speed.)
@jhpoelen is the offset
part of the query relevant to getting this error, or do you get the same error without it? I'm wondering if the API is sending the wrong information about Content-Length when using offsets.
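One way to probe that hypothesis is to compare the advertised Content-Length against the bytes actually received, with and without the offset parameter. A minimal sketch: `check_length` and the stub fetcher are hypothetical, and a real run would swap in `urllib.request.urlopen`; the byte counts in the stub are invented purely to illustrate a header that overstates the body.

```python
from urllib.parse import urlencode

BASE = "http://fishbase.ropensci.org/species"

def check_length(fetch, params):
    """Return (declared, actual) byte counts for one query.
    `fetch` takes a URL and returns (headers_dict, body_bytes);
    in real use it would wrap urllib.request.urlopen."""
    url = BASE + "?" + urlencode(params)
    headers, body = fetch(url)
    return int(headers["Content-Length"]), len(body)

# Offline stub standing in for the live API (invented numbers).
def fake_fetch(url):
    return {"Content-Length": "100"}, b"x" * 90

declared, actual = check_length(fake_fetch, {"limit": 5000, "offset": 0})
print(declared, actual)  # prints: 100 90
```

Running the same comparison with and without `offset` against the live API would show whether the mismatch is offset-specific.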
@cboettig I could reproduce it earlier today, but after your server fix I can't reproduce it, so I think it may be fixed.
Carl, we may want to roll back this commit https://github.com/ropensci/fishbaseapi/commit/131a6b07940ff6cda42dc9a0c1c10772e63b08bd if this is sorted.
@sckott yeah, I'm not sure if the server fix actually fixed it, or if it's just a temperamental network issue. CRAN's test machines were getting some 'incomplete file transfer' errors too on a few tests I forgot to skip_on_cran. Let's see if @jhpoelen can reproduce before we roll back.
thanks for the quick reply. I was not able to reproduce the issue just now. Also, close inspection in the Firefox "Network" developer window now indicates that the content-length is the same as the bytes transferred. Whatever changed... the API now works like a charm! Closing issue.
PS I doubt this was a network issue: I was able to get the results OK, but the content-length header simply didn't reflect the bytes transferred, causing HTTP clients to freak out.
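That matches how strict HTTP clients behave: if the connection closes before Content-Length bytes arrive, they raise an error rather than return a short body. A minimal offline sketch of that check, reusing Python's http.client exception type (the byte counts are illustrative):

```python
from http.client import IncompleteRead

def read_body(body: bytes, content_length: int) -> bytes:
    """Mimic a strict HTTP client: fail if fewer bytes arrived
    than the Content-Length header promised."""
    if len(body) < content_length:
        raise IncompleteRead(body, content_length - len(body))
    return body

# A truncated transfer: the server advertised ~14 MB but closed early.
try:
    read_body(b"x" * 6576816, 14377534)
except IncompleteRead as err:
    print("got", len(err.partial), "of", len(err.partial) + err.expected, "bytes")
```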
@jhpoelen we ended up having a problem on the server where our disk was full. That somehow caused larger requests to not be handled correctly.
Unfortunately, it seems that the issue is still occurring.
To reproduce:
wget "http://fishbase.ropensci.org/species?limit=5000&offset=0"
--2015-09-16 11:14:01-- http://fishbase.ropensci.org/species?limit=5000&offset=0
Resolving fishbase.ropensci.org... 198.199.104.22
Connecting to fishbase.ropensci.org|198.199.104.22|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14377534 (14M) [application/json]
Saving to: `species?limit=5000&offset=0'
45% [===============================> ] 6,576,816 394K/s in 27s
2015-09-16 11:14:37 (235 KB/s) - Connection closed at byte 6576816. Retrying.
--2015-09-16 11:14:38-- (try: 2) http://fishbase.ropensci.org/species?limit=5000&offset=0
Connecting to fishbase.ropensci.org|198.199.104.22|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14377534 (14M) [application/json]
Saving to: `species?limit=5000&offset=0'
9% [======> ] 1,428,432 382K/s eta 39s
@jhpoelen works for me, 3 times in a row; works fine with curl too.
Is it repeatedly failing for you?
Weird!
I tried again, and this time the issue occurred a bit later:
wget "http://fishbase.ropensci.org/species?limit=5000&offset=0"
--2015-09-16 11:25:11-- http://fishbase.ropensci.org/species?limit=5000&offset=0
Resolving fishbase.ropensci.org... 198.199.104.22
Connecting to fishbase.ropensci.org|198.199.104.22|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14377534 (14M) [application/json]
Saving to: `species?limit=5000&offset=0.1'
73% [===================================================> ] 10,567,504 171K/s in 48s
2015-09-16 11:26:02 (213 KB/s) - Connection closed at byte 10567504. Retrying.
--2015-09-16 11:26:03-- (try: 2) http://fishbase.ropensci.org/species?limit=5000&offset=0
Connecting to fishbase.ropensci.org|198.199.104.22|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14377534 (14M) [application/json]
Saving to: `species?limit=5000&offset=0.1'
Could it be that the server is timing out long-lasting downloads?
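If the server is cutting off long transfers, nginx's proxy timeouts would be a plausible culprit (the Server header later in the thread reports nginx/1.9.4). A sketch of the relevant directives, assuming the API sits behind an nginx reverse proxy; the upstream name and values are illustrative, not the actual server config:

```nginx
location / {
    proxy_pass http://api_upstream;   # illustrative upstream name
    # Both default to 60s, which a ~14 MB response at ~200 KB/s can
    # exceed if either side stalls; raising them rules out
    # proxy-side timeouts as the cause of the early close.
    proxy_read_timeout 300s;
    send_timeout       300s;
}
```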
It is failing pretty consistently on my end... here are more logs:
--2015-09-16 11:26:03-- (try: 2) http://fishbase.ropensci.org/species?limit=5000&offset=0
Connecting to fishbase.ropensci.org|198.199.104.22|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14377534 (14M) [application/json]
Saving to: `species?limit=5000&offset=0.1'
74% [====================================================> ] 10,748,504 216K/s in 50s
2015-09-16 11:26:55 (209 KB/s) - Connection closed at byte 10748504. Retrying.
--2015-09-16 11:26:57-- (try: 3) http://fishbase.ropensci.org/species?limit=5000&offset=0
Connecting to fishbase.ropensci.org|198.199.104.22|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14377534 (14M) [application/json]
Saving to: `species?limit=5000&offset=0.1'
73% [===================================================> ] 10,631,216 173K/s in 51s
2015-09-16 11:27:50 (202 KB/s) - Connection closed at byte 10748504. Retrying.
--2015-09-16 11:27:53-- (try: 4) http://fishbase.ropensci.org/species?limit=5000&offset=0
Connecting to fishbase.ropensci.org|198.199.104.22|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14377534 (14M) [application/json]
Saving to: `species?limit=5000&offset=0.1'
69% [================================================> ] 10,034,640 257K/s in 47s
2015-09-16 11:28:41 (208 KB/s) - Connection closed at byte 10748504. Retrying.
--2015-09-16 11:28:45-- (try: 5) http://fishbase.ropensci.org/species?limit=5000&offset=0
Connecting to fishbase.ropensci.org|198.199.104.22|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14377534 (14M) [application/json]
Saving to: `species?limit=5000&offset=0.1'
73% [===================================================> ] 10,638,456 244K/s in 51s
2015-09-16 11:29:38 (203 KB/s) - Connection closed at byte 10748504. Retrying.
can you do the same with curl -v (verbose) so I can see the headers returned?
@jhpoelen what's your connection speed like where you are?
My (home) connection has ~3.3 Mbps according to speedtest.net.
For some reason, the issue no longer occurs now. ;( Both curl -v and wget succeeded for species?limit=5000&offset=0 and species?limit=5000&offset=1.
Here's the verbose curl output for a successful run:
curl -v "http://fishbase.ropensci.org/species?limit=5000&offset=0" > ~/Desktop/result.json
* Adding handle: conn: 0x7fda61803a00
* Adding handle: send: 0
* Adding handle: recv: 0
* Curl_addHandleToPipeline: length: 1
* - Conn 0 (0x7fda61803a00) send_pipe: 1, recv_pipe: 0
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* About to connect() to fishbase.ropensci.org port 80 (#0)
* Trying 198.199.104.22...
* Connected to fishbase.ropensci.org (198.199.104.22) port 80 (#0)
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0> GET /species?limit=5000&offset=0 HTTP/1.1
> User-Agent: curl/7.30.0
> Host: fishbase.ropensci.org
> Accept: */*
>
0 0 0 0 0 0 0 0 --:--:-- 0:00:10 --:--:-- 0< HTTP/1.1 200 OK
* Server nginx/1.9.4 is not blacklisted
< Server: nginx/1.9.4
< Date: Wed, 16 Sep 2015 18:48:34 GMT
< Content-Type: application/json; charset=utf8
< Content-Length: 14377534
< Connection: close
< Status: 200 OK
< Access-Control-Allow-Methods: HEAD, GET
< Access-Control-Allow-Origin: *
< Cache-Control: public, must-revalidate, max-age=60
< X-Content-Type-Options: nosniff
<
{ [data not shown]
100 13.7M 100 13.7M 0 0 160k 0 0:01:27 0:01:27 --:--:-- 392k
* Closing connection 0
Perhaps the nginx caching directory is sporadically running out of disk space?
Thanks for that.
Perhaps; we'll look into it.
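Since a full disk caused the earlier incident, a simple check on the server could catch a recurrence early. A sketch; the /var/cache/nginx path mentioned in the comment is a guess, since cache locations vary by setup:

```python
import shutil

def disk_usage_pct(path: str = "/") -> float:
    """Percent of the filesystem holding `path` that is in use."""
    total, used, _free = shutil.disk_usage(path)
    return 100 * used / total

# A full disk broke large responses once before, so alert well
# before the filesystem actually fills up.
if disk_usage_pct("/") > 90:
    print("disk nearly full; check the nginx cache dir (e.g. /var/cache/nginx)")
```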
hey y'all:
I've been poking around the fishbase API... thanks for building it!
I am observing a strange content-length issue that occurs for larger requests (I am interested in getting all the data).
To reproduce:
wget "http://fishbase.ropensci.org/fooditems?limit=1000&offset=1600"
Expected result: successful download of 1,000 records from the fooditems table (starting at offset 1600).
Actual result: for some reason the content-length is set too high, causing wget (and any other HTTP client that checks the content-length) to think the connection was closed unexpectedly (see log trace below).
Note that this issue does not occur when downloading the data manually in a browser, even though the content-length is off. Also, the issue does not seem to occur with smaller batches (e.g. limit=100).
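Given that smaller batches work, one workaround until the header issue is fixed is to page through the table in smaller chunks. A sketch; the total of 350 records is illustrative, and the real fetch is commented out so this stays offline:

```python
def offsets(total: int, batch: int = 100):
    """Yield (limit, offset) pairs covering `total` records in
    batches small enough to dodge the content-length problem."""
    for start in range(0, total, batch):
        yield min(batch, total - start), start

for limit, offset in offsets(350, batch=100):
    url = f"http://fishbase.ropensci.org/fooditems?limit={limit}&offset={offset}"
    # records = json.load(urllib.request.urlopen(url))  # real fetch goes here
    print(url)
```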
Hope this helps, -jorrit