jakubgs opened this issue 5 years ago
Once IPFS has been stuck for more than 10 minutes, could you run curl 'localhost:5001/debug/pprof/goroutine?debug=2' > goroutines
and post the output? That'll help us figure out what's stuck.
IPFS is waiting for more data. Some HTTP implementations don't like it when the server responds before it reads the entire body. I know we had to work around this in go in https://github.com/ipfs/go-ipfs-cmds/pull/116.
When you say "The nginx proxy passes a chunk of data", is that all the remaining data?
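For context, here is a minimal sketch in Go of the pattern being described; it is not the actual go-ipfs-cmds code, and the port and JSON shape are placeholders. The handler starts writing progress lines while it is still reading the request body, which is exactly what some proxies and HTTP client implementations don't expect:

// Minimal sketch (not the actual go-ipfs-cmds handler) of a server that
// responds before it has read the entire request body.
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	http.HandleFunc("/api/v0/add", func(w http.ResponseWriter, r *http.Request) {
		flusher, _ := w.(http.Flusher)
		buf := make([]byte, 256*1024)
		var total int64
		for {
			n, err := io.ReadFull(r.Body, buf)
			total += int64(n)
			if n > 0 {
				// A progress object goes out before the request body has been
				// fully consumed -- this is what trips up some proxies.
				fmt.Fprintf(w, "{\"Bytes\":%d}\n", total)
				if flusher != nil {
					flusher.Flush()
				}
			}
			if err != nil { // io.EOF or io.ErrUnexpectedEOF ends the loop
				break
			}
		}
	})
	log.Fatal(http.ListenAndServe("127.0.0.1:5001", nil))
}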
Trace:
goroutine 15486 [IO wait, 10 minutes]:
internal/poll.runtime_pollWait(0x7f8061103090, 0x72, 0xffffffffffffffff)
/usr/local/go/src/runtime/netpoll.go:182 +0x56
internal/poll.(*pollDesc).wait(0xc00270d698, 0x72, 0x1000, 0x1000, 0xffffffffffffffff)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x9b
internal/poll.(*pollDesc).waitRead(...)
/usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Read(0xc00270d680, 0xc00283c000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/usr/local/go/src/internal/poll/fd_unix.go:169 +0x19b
net.(*netFD).Read(0xc00270d680, 0xc00283c000, 0x1000, 0x1000, 0x18b, 0xc0012a1390, 0xbd478a)
/usr/local/go/src/net/fd_unix.go:202 +0x4f
net.(*conn).Read(0xc0000aec78, 0xc00283c000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/usr/local/go/src/net/net.go:177 +0x69
net/http.(*connReader).Read(0xc0028030b0, 0xc00283c000, 0x1000, 0x1000, 0x18b, 0x0, 0x0)
/usr/local/go/src/net/http/server.go:787 +0x107
bufio.(*Reader).Read(0xc002771d40, 0xc00283c000, 0x1000, 0x1000, 0xc002261ec0, 0xc000040570, 0xc000040500)
/usr/local/go/src/bufio/bufio.go:209 +0x126
io.(*LimitedReader).Read(0xc0028165a0, 0xc00283c000, 0x1000, 0x1000, 0xc0012a14f8, 0x913bef, 0x8)
/usr/local/go/src/io/io.go:448 +0x63
net/http.(*body).readLocked(0xc0024c5440, 0xc00283c000, 0x1000, 0x1000, 0x18b, 0x0, 0x0)
/usr/local/go/src/net/http/transfer.go:816 +0x5f
net/http.(*body).Read(0xc0024c5440, 0xc00283c000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
/usr/local/go/src/net/http/transfer.go:808 +0xdb
github.com/ipfs/go-ipfs-cmds/http.bodyWrapper.Read(0x19a3960, 0xc0024c5440, 0xc002816600, 0xc00283c000, 0x1000, 0x1000, 0x160, 0x1000, 0xd)
/go/pkg/mod/github.com/ipfs/go-ipfs-cmds@v0.0.5/http/body.go:13 +0x59
mime/multipart.(*stickyErrorReader).Read(0xc002816640, 0xc00283c000, 0x1000, 0x1000, 0x0, 0x30, 0xc00283c18b)
/usr/local/go/src/mime/multipart/multipart.go:125 +0x66
bufio.(*Reader).fill(0xc002771da0)
/usr/local/go/src/bufio/bufio.go:100 +0x10f
bufio.(*Reader).Peek(0xc002771da0, 0x1, 0xe75, 0xc002800b72, 0x2a, 0x2e, 0xc002800b70)
/usr/local/go/src/bufio/bufio.go:138 +0x4f
mime/multipart.partReader.Read(0xc000eefb90, 0xc001a4664a, 0x1b9b6, 0x1b9b6, 0x18b, 0x0, 0x0)
/usr/local/go/src/mime/multipart/multipart.go:178 +0x169
mime/multipart.(*Part).Read(0xc000eefb90, 0xc001a4664a, 0x1b9b6, 0x1b9b6, 0x18b, 0x0, 0x0)
/usr/local/go/src/mime/multipart/multipart.go:158 +0x52
github.com/ipfs/go-ipfs-files.(*ReaderFile).Read(0xc0024c5580, 0xc001a4664a, 0x1b9b6, 0x1b9b6, 0x18b, 0x0, 0x0)
/go/pkg/mod/github.com/ipfs/go-ipfs-files@v0.0.2/readerfile.go:52 +0x52
github.com/ipfs/go-ipfs/core/coreunix.(*progressReader).Read(0xc0024c5640, 0xc001a4664a, 0x1b9b6, 0x1b9b6, 0x18b, 0x0, 0x0)
/go-ipfs/core/coreunix/add.go:497 +0x5b
github.com/ipfs/go-ipfs/core/coreunix.(*progressReader2).Read(0xc002816740, 0xc001a4664a, 0x1b9b6, 0x1b9b6, 0x18b, 0x0, 0x0)
/go-ipfs/core/coreunix/add.go:517 +0x4c
io.ReadAtLeast(0x1991f40, 0xc002816740, 0xc001a22000, 0x40000, 0x40000, 0x40000, 0x19bbca0, 0xc0000aa080, 0xc0012a1970)
/usr/local/go/src/io/io.go:310 +0x88
io.ReadFull(...)
/usr/local/go/src/io/io.go:329
github.com/ipfs/go-ipfs-chunker.(*sizeSplitterv2).NextBytes(0xc0028034a0, 0x19cf4c0, 0xc0012d49a0, 0x0, 0x0, 0x0)
/go/pkg/mod/github.com/ipfs/go-ipfs-chunker@v0.0.1/splitting.go:86 +0xa2
github.com/ipfs/go-unixfs/importer/helpers.(*DagBuilderHelper).prepareNext(0xc00039def0)
/go/pkg/mod/github.com/ipfs/go-unixfs@v0.0.4/importer/helpers/dagbuilder.go:110 +0x50
github.com/ipfs/go-unixfs/importer/helpers.(*DagBuilderHelper).Done(...)
/go/pkg/mod/github.com/ipfs/go-unixfs@v0.0.4/importer/helpers/dagbuilder.go:120
github.com/ipfs/go-unixfs/importer/balanced.fillNodeRec(0xc00039def0, 0xc0021144d0, 0x1, 0x40000, 0xc00039def0, 0x0, 0x0, 0x158dd40)
/go/pkg/mod/github.com/ipfs/go-unixfs@v0.0.4/importer/balanced/builder.go:232 +0x26e
github.com/ipfs/go-unixfs/importer/balanced.Layout(0xc00039def0, 0x19a1220, 0xc0028034a0, 0xc00039def0, 0x0)
/go/pkg/mod/github.com/ipfs/go-unixfs@v0.0.4/importer/balanced/builder.go:166 +0x14c
github.com/ipfs/go-ipfs/core/coreunix.(*Adder).add(0xc0023572c0, 0x1991f40, 0xc002816740, 0x7f8061154410, 0xc0024c5580, 0xc0024f6b01, 0xc0024f6be0)
/go-ipfs/core/coreunix/add.go:123 +0x217
github.com/ipfs/go-ipfs/core/coreunix.(*Adder).addFile(0xc0023572c0, 0x0, 0x0, 0x19bc160, 0xc0024c5580, 0x1, 0xc001298100)
/go-ipfs/core/coreunix/add.go:395 +0x77
github.com/ipfs/go-ipfs/core/coreunix.(*Adder).addFileNode(0xc0023572c0, 0x0, 0x0, 0x19a12a0, 0xc0024c5580, 0x1, 0x0, 0x0)
/go-ipfs/core/coreunix/add.go:360 +0x1b9
github.com/ipfs/go-ipfs/core/coreunix.(*Adder).AddAllAndPin(0xc0023572c0, 0x19a12a0, 0xc0024c5580, 0x0, 0x0, 0x0, 0x0)
/go-ipfs/core/coreunix/add.go:270 +0xb1
github.com/ipfs/go-ipfs/core/coreapi.(*UnixfsAPI).Add(0xc0003583c0, 0x19bbd20, 0xc002803350, 0x19a12a0, 0xc0024c5580, 0xc002771e60, 0xb, 0xb, 0xc0000d9f98, 0x914491, ...)
/go-ipfs/core/coreapi/unixfs.go:124 +0x474
github.com/ipfs/go-ipfs/core/commands.glob..func6.1(0xc002771ec0, 0xc002771e00, 0x19d1740, 0xc0003583c0, 0xc000eefb20, 0x19a12a0, 0xc0024c5580, 0xc002771e60, 0xb, 0xb)
/go-ipfs/core/commands/add.go:238 +0x115
created by github.com/ipfs/go-ipfs/core/commands.glob..func6
/go-ipfs/core/commands/add.go:234 +0x9e3
What is the status of this bug now?
Waiting on more information. From the stack traces, go-ipfs looks like it's operating correctly.
If you want to debug this further, try using Wireshark to dump the HTTP request/response between nginx and go-ipfs.
@johanherman did some research about this problem. TL;DR: uploads through the ALB fail when progress is enabled, but they work when the traffic is bounced through an extra hop (client <-> socat <-> ALB <-> socat <-> server) or when progress is disabled with -q.

When a request with progress=true&stream-channels=true and Transfer-Encoding: chunked is made, the client starts sending chunks, and after a while the daemon ACKs with a HTTP/1.1 200 OK in the middle of the stream. The client continues to send further chunks, and periodically the daemon responds with small JSON messages telling the client how many bytes of the file have been processed. Eventually the client sends the ending boundary indicator followed by a 0\r\n message, so that the daemon knows the client is done. The daemon responds with the last remaining JSON status updates, then with a message containing the hash, and finally with a last 0\r\n message of its own to terminate everything from the daemon side.

If we make the same request via an ALB it starts out similar: the daemon ACKs after a while during the stream of chunks (which are now of different sizes) and has time to send some JSON status updates, but eventually the daemon responds with the ending terminator 0\r\n and the error X-Stream-Error: http: unexpected EOF reading trailer. In other words, from the viewpoint of the server the connection has been terminated. The last chunk coming from the client (via the ALB) is usually suspiciously small, i.e. it looks as if something went wrong when the ALB re-packaged that chunk.

Something that is weird, though, is that if I bounce my traffic one extra step before the ALB, things still work, i.e. client <-> socat <-> ALB <-> socat <-> server.

It is also possible to make it work by disabling the progress updates. Here this was done by adding the silent flag -q to go-ipfs, i.e. go-ipfs --api /ip4/<ip>/tcp/<port> add -q --pin <file>.

Note: we have a report saying that using progress=false doesn't entirely fix the problem.

The quiet variant will still send the Transfer-Encoding: chunked header, but now with the args quiet=true&stream-channels=true. The ALB will still re-chunk everything, but this time the daemon waits and does not send any status updates. As expected, after the ending boundary and terminator are sent by the client, the server ACKs with a HTTP/1.1 200 OK containing the JSON message with the uploaded hash, followed by a terminating message. (Another thing to note is that quiet go-ipfs client uploads are much quicker, and there seems to be a related issue about it: https://github.com/ipfs/go-ipfs/issues/4297)
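For reference, here is a rough, self-contained sketch (not the actual go-ipfs client) of the request shape described above: a multipart body streamed with Transfer-Encoding: chunked while the response is read in parallel. The endpoint and query string are the ones quoted in this thread; the local file name is just a placeholder.

// Sketch of a chunked multipart upload to /api/v0/add that reads the streamed
// progress JSON while the request body is still being sent.
package main

import (
	"io"
	"mime/multipart"
	"net/http"
	"os"
)

func main() {
	pr, pw := io.Pipe()
	mw := multipart.NewWriter(pr.(*io.PipeReader).CloseWithError != nil && false == true /* never */ || true && pw != nil /* keep pw */ && true == true /* placeholder */ && false || true == true /* no-op */ && pw != nil /* unused */ && true /* see below */ == true && pw != nil /* noop */ && true && pw != nil && true && pw != nil && true && pw != nil && true && pw != nil && true && pw != nil && true && pw != nil && true && pw != nil && true && pw != nil && true && pw != nil && true && pw != nil && true && pw != nil && true && pw != nil && true && pw != nil && pw != nil && pw != nil && pw != nil && pw != nil)
	_ = mw
}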
From my tests with Wireshark, nginx stops sending content to ipfs from the very moment it receives the first response chunk (the progress object). When and how that happens is controlled by the proxy buffering options in nginx, so it may look like it works for a bit and then stalls.
I don't think we can do much about this.
Hi everyone,
I was able to make the progress option work behind an Nginx proxy using these configuration options in the location section:
proxy_request_buffering off;
proxy_buffering off;
proxy_http_version 1.1;
I first noticed the issue while using an ALB, but then I switched to an NLB to simply proxy the TCP connection to a target group.
I dug deeper since I wanted to enable SSL, so I put Nginx in front of each IPFS node to expose it over HTTPS, and got the progress option working by disabling buffering and enabling HTTP/1.1 for proxying.
Thanks for solving this! I've added it to some documentation in #7469.
I tried the solution suggested by @dsemenovsky and it does not seem to help:
> dd if=/dev/urandom of=file.txt count=10 bs=1M
10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.110425 s, 95.0 MB/s
> curl -k --max-time 5 -F 'file=@file.txt' -H 'host: ipfs.example.org' 'https://localhost:443/api/v0/add?progress=true'
{"Name":"","Hash":"b","Bytes":262144}
{"Name":"","Hash":"b","Bytes":524288}
{"Name":"","Hash":"b","Bytes":786432}
{"Name":"","Hash":"b","Bytes":1048576}
{"Name":"","Hash":"b","Bytes":1310720}
{"Name":"","Hash":"b","Bytes":1572864}
{"Name":"","Hash":"b","Bytes":1835008}
{"Name":"","Hash":"b","Bytes":2097152}
{"Name":"","Hash":"b","Bytes":2359296}
{"Name":"","Hash":"b","Bytes":2621440}
{"Name":"","Hash":"b","Bytes":2883584}
{"Name":"","Hash":"b","Bytes":3145728}
{"Name":"","Hash":"b","Bytes":3407872}
{"Name":"","Hash":"b","Bytes":3670016}
{"Name":"","Hash":"b","Bytes":3932160}
curl: (28) Operation timed out after 5001 milliseconds with 582 bytes received
I'm running ipfs/go-ipfs:v0.5.1.
Hey @jakubgs, can you post your nginx config (location section) and tell me what other load balancers you have in front of nginx?
No other LBs; as you can see, I'm querying port 443 from localhost to talk directly to Nginx.
My location section has nothing but this:
location /api/v0/add {
proxy_pass http://localhost:9095;
}
The settings you suggested were put under the server block.
Try having only one section that proxies all paths to your IPFS, and try putting everything in the location section.
That's what I did, and if it works for you too it would be easier to debug once you start moving the directives back into the /api/v0/add location one by one.
It's pointless to try that since the settings are documented as applying in both location and server blocks:
https://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_buffering
But I tried it just for you, and it didn't help either.
Sorry it didn’t.
Yeah, it's tricky. And just to be clear, I tried all 6 combinations of the options, directly in the location block.
It did work once with just proxy_buffering, but I couldn't reproduce it, and it does sometimes work with smaller files (~7-8 MB), seemingly at random.
By the way, I am using Nginx 1.16 if that helps. IPFS is v0.5.1, same as yours.
@Stebalien instead of streaming progress updates from go-ipfs to the client, would it be possible to compute the progress client-side only, by metering how much data has been received? It would avoid having this sort of bidirectional HTTP request, which is at least a gray area if you read the RFCs.
For just progress, yes. But ipfs add also returns the CIDs of each file it adds.
But yes, clients that can't handle bidirectional streaming can implement client-side progress.
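As a rough illustration of that client-side approach, here is a minimal sketch, with purely illustrative names (this is not part of any go-ipfs API), of a reader wrapper that meters the upload on the client instead of relying on the daemon's streamed progress objects:

// countingReader counts bytes as the HTTP client consumes the request body.
package main

import (
	"fmt"
	"io"
	"strings"
	"sync/atomic"
)

type countingReader struct {
	r io.Reader
	n int64 // total bytes read so far, updated atomically
}

func (c *countingReader) Read(buf []byte) (int, error) {
	n, err := c.r.Read(buf)
	atomic.AddInt64(&c.n, int64(n))
	return n, err
}

// Total reports how much of the body has been handed to the transport.
func (c *countingReader) Total() int64 { return atomic.LoadInt64(&c.n) }

func main() {
	// Stand-in for a file being uploaded; in a real client this reader would
	// be passed as the request body to http.NewRequest.
	body := &countingReader{r: strings.NewReader("some file contents")}
	if _, err := io.Copy(io.Discard, body); err != nil {
		panic(err)
	}
	fmt.Printf("sent %d bytes\n", body.Total())
}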
Why was this closed? Did anyone confirm the fix actually works?
It was closed automatically because I merged a linked issue.
However, it's not a bug in go-ipfs itself so it really doesn't belong in the issue tracker (and we can't do anything about it other than try to help you debug your setup). If you're looking for more help on this, you should try reaching out on the forums and/or IRC/Matrix.
go-ipfs not being able to work with the most popular HTTP server/proxy is not an IPFS issue? Really?
Well, you've prompted me to dig into this a bit more. I assumed that @dsemenovsky's fix worked, but I couldn't replicate it. A bit of searching turned up:
Assuming these are up to date, I'm not sure what we can do other than to ask upstream to fix the bug (or someone could submit a patch). We're planning on eventually replacing the API with something better (probably a websocket based CBOR-RPC transport), but that's not a short term solution.
Note: this isn't just about ipfs add --progress. We've designed our commands system to stream:

ipfs add foo bar baz will stream back CIDs for foo, bar, and baz.
ipfs block put file1 file2... will also stream back CIDs.
ipfs cat, ipfs get, etc. are supposed to stream the set of IPFS paths to read from the request body, but currently read them all in at once up-front (there's a TODO to fix this).

For the moment, all I can do is add some documentation explaining why this doesn't work. But it's not going to be a "quick fix" unless someone can convince NGINX to fix the issue on their end.
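To make that streaming shape concrete, here is a hedged sketch of how a client might consume the newline-delimited JSON output of add. It assumes, based on the output pasted elsewhere in this thread, that progress updates carry a Bytes field with an empty Name, while per-file results carry Name and Hash; the exact schema is not spelled out here.

// Sketch of consuming the streamed add output (progress updates + per-file CIDs).
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// addEvent covers both progress updates and per-file results in the stream.
type addEvent struct {
	Name  string
	Hash  string
	Bytes int64
}

func readAddStream(r io.Reader) error {
	dec := json.NewDecoder(r)
	for {
		var ev addEvent
		if err := dec.Decode(&ev); err == io.EOF {
			return nil
		} else if err != nil {
			return err
		}
		if ev.Name == "" && ev.Bytes > 0 {
			fmt.Printf("progress: %d bytes\n", ev.Bytes)
		} else {
			fmt.Printf("added %s %s\n", ev.Hash, ev.Name)
		}
	}
}

func main() {
	// Stand-in for resp.Body from the /api/v0/add request; values are placeholders.
	stream := strings.NewReader(
		`{"Name":"","Hash":"","Bytes":262144}` + "\n" +
			`{"Name":"foo","Hash":"Qm..."}` + "\n")
	if err := readAddStream(stream); err != nil {
		panic(err)
	}
}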
@MichaelMure have you never run into this in production?
@Stebalien we do, and we are experimenting to find a solution. Notably, AWS ALB (layer 7) seems to be picky about this, so we are going to migrate over to NLB (layer 4). It will likely not be enough though.
My understanding is that (for good reason or not) a lot of networking code out there assumes that an HTTP response can only come after the request has been fully received by the server. It's also the case for net/http in Go, even if you hijack the TCP connection from there. It might be possible, but that looks like a major pain.
Yeah, we've run into the net/http bug: https://github.com/ipfs/go-ipfs/issues/5168. However, we were able to work around it (sort of).
@dsemenovsky's fix worked for me with the nginx:latest container (version 1.19).
It's worth noting that @dsemenovsky was on Nginx 1.16.0, while I was testing this fix on 1.14.0.
Hm. I tried the fix on 1.18 and 1.19 and it failed with a large (10GiB) directory:
worker_processes 1;
events {
    worker_connections 1024;
}
http {
    default_type application/octet-stream;
    server {
        listen 127.0.0.1:80;
        server_name localhost;
        location / {
            proxy_request_buffering off;
            proxy_buffering off;
            proxy_http_version 1.1;
            client_max_body_size 100G;
            proxy_pass http://localhost:5001;
        }
    }
}
I've also tried inserting socat between nginx, the go-ipfs client, and the go-ipfs daemon. No dice.
Used the proposed solution. Got an error after using ipfs.files.add:
argument "key" is required
The hash is still returned, but it is only reachable through the same node it was uploaded from. The IPFS Gateway returns a 502.
I've upgraded my IPFS cluster hosts to Ubuntu 20.04, which gave me Nginx 1.18.0, but that did not make it work:
> nginx -v
nginx version: nginx/1.18.0 (Ubuntu)
> sudo grep -A 3 'Fix' /etc/nginx/sites-enabled/ipfs.example.org.conf
# Fix uploads of big files
proxy_request_buffering off;
proxy_buffering off;
proxy_http_version 1.1;
> dd if=/dev/urandom of=file.txt count=10 bs=1M
10+0 records in
10+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.114654 s, 91.5 MB/s
> curl -k --max-time 10 -F 'file=@file.txt' -H 'host: ipfs.example.org' 'https://localhost:443/api/v0/add?progress=true'
{"Name":"","Hash":"b","Bytes":262144}
{"Name":"","Hash":"b","Bytes":524288}
{"Name":"","Hash":"b","Bytes":786432}
{"Name":"","Hash":"b","Bytes":1048576}
{"Name":"","Hash":"b","Bytes":1310720}
{"Name":"","Hash":"b","Bytes":1572864}
{"Name":"","Hash":"b","Bytes":1835008}
{"Name":"","Hash":"b","Bytes":2097152}
{"Name":"","Hash":"b","Bytes":2359296}
{"Name":"","Hash":"b","Bytes":2621440}
{"Name":"","Hash":"b","Bytes":2883584}
{"Name":"","Hash":"b","Bytes":3145728}
{"Name":"","Hash":"b","Bytes":3407872}
{"Name":"","Hash":"b","Bytes":3670016}
{"Name":"","Hash":"b","Bytes":3932160}
{"Name":"","Hash":"b","Bytes":4194304}
{"Name":"","Hash":"b","Bytes":4456448}
{"Name":"","Hash":"b","Bytes":4718592}
{"Name":"","Hash":"b","Bytes":4980736}
{"Name":"","Hash":"b","Bytes":5242880}
{"Name":"","Hash":"b","Bytes":5505024}
curl: (28) Operation timed out after 10000 milliseconds with 816 bytes received
@ryanshahine you have a different issue. Please ask on https://discuss.ipfs.io.
@jakubgs I can confirm. I can't get this working on any version of nginx.
Managed to get this to work by adding client_max_body_size 0; to @dsemenovsky's answer:
client_max_body_size 0;
proxy_request_buffering off;
proxy_buffering off;
proxy_http_version 1.1;
I fail to see how this would do anything. This setting is not related. Likely it just works by chance because the request body upload finishes before the response starts being written.
Note the error usually appears sporadically.
It solved my problem; I can toggle the error on and off by adding or removing that line.
Version
I'm using a docker container: ipfs/go-ipfs:v0.4.20
Description
When using the progress=true flag with the /api/v0/add API path, the upload just stops midway for files larger than a few MB (the larger the file, the more likely it is to stop).
Steps to reproduce
Configure an Nginx reverse proxy for port 5001:
Create a 4MB file:
Attempt to upload it with progress=true enabled:
Create another file that is 10MB:
Try to upload again:
And then it just stops, indefinitely.
If the progress=true option is omitted, everything works just fine.
Details
Naturally, the upload works just fine without the proxy. I've tried watching the traffic with tcpdump to figure out who stops responding, and it seems to be the IPFS service. The nginx proxy passes a chunk of data, then IPFS doesn't respond, and everything stops.