rclone / rclone

"rsync for cloud storage" - Google Drive, S3, Dropbox, Backblaze B2, One Drive, Swift, Hubic, Wasabi, Google Cloud Storage, Azure Blob, Azure Files, Yandex Files
https://rclone.org
MIT License
45.15k stars 4.04k forks source link

performance is shot on minio over tls #3835

Closed CosmicToast closed 4 years ago

CosmicToast commented 4 years ago

What is the problem you are having with rclone?

Under some circumstances, performance is shot (more on the circumstances below) on minio over tls. Rclone will transfer ~45MB, after which it stalls (and MB/s keeps going down). I can replicate this with the latest beta but not with latest mcli.

What is your rclone version (output from rclone version)

rclone v1.50.2-094-g207474ab-beta
- os/arch: linux/amd64
- go version: go1.13.5

Which OS you are using and how many bits (eg Windows 7, 64 bit)

Arch Linux, Amd64

Which cloud storage system are you using? (eg Google Drive)

Minio

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone copy ./largefile remote:tmp -P

A log from the command with the -vv flag (eg output from rclone -vv copy /tmp remote:tmp)

2019/12/30 20:41:08 DEBUG : rclone: Version "v1.50.2-094-g207474ab-beta" starting with parameters ["./rclone" "copy" "../large.zip" "fafnir:shared/tmp" "-vv"]
2019/12/30 20:41:08 DEBUG : Using config file from "/home/toast/.config/rclone/rclone.conf"
2019/12/30 20:41:08 DEBUG : large.zip: Need to transfer - File not found at Destination

it copies 45M almost instantly, after which it stalls for a long time then progress jumps to 65M almost instantly, and the stalling continues, until it jumps by 20M more (and the cycle repeats) The speed will occasionally jump to 600kb/s (whenever a bit of progress is made), and then drops in freefall again No further debug lines show up

Further information on the setup

I have a gateway running Alpine Linux and Caddy. It reverse proxies back to fafnir (the minio host) which is running in an lxd container in the internal network.

This does not trigger if any of the following are changed: a) I use a different client (only checked with mcli, which is consistent at >100MB/s) b) change rclone.conf to refer to http://fafnir:9000 in place of https://minio.host

CosmicToast commented 4 years ago

As an update, I have checked using aws-cli and s3cmd (using instructions from minio 1 and 2). They perform as expected as well.

ncw commented 4 years ago

The problem seems to be caused by the introduction of the proxy - is that right?

I doubt it is https in particular as S3 transactions are normally done via https.

Have you tried doing an wireshark dump to see if you can see what is going on?

Are there any hints in the proxy log?

CosmicToast commented 4 years ago

I ran tcpdump to see what would happen, and it looks like rclone simply isn't sending packets (I let it get to 45M, then 65M, then 85M).

every time, at some point, rclone will send some application data, the server will return with an ack, and then silence 15 seconds later the server sends a keep-alive packet that rclone responds to this repeats 3 times, 15 seconds apart each time then rclone sends ~10 acks/psh,acks in a row, after which the transfer seems to continue

I'm not particularly well-versed in networking (especially not on the tcp/ip level), so that means very little to me, and I'd rather not send the pcap file (given that it does have auth, ips and so on in it) however, considering that this behavior does not happen with minio-client, aws-cli or s3cmd I doubt we can blame the reverse-proxy further, it used to work fine some time ago, I might try to bisect for the breaking commit at some point (though depending on the introduction of go modules usage that may be fruitless)

ncw commented 4 years ago

I doubt we can blame the reverse-proxy

It will be an interaction between rclone and the reverse proxy... Maybe rclone is using the proxy differently (eg using HTTP/2) or something like that.

, I might try to bisect for the breaking commit at some point (though depending on the introduction of go modules usage that may be fruitless)

That would be really really useful if you could. Rclone is fully vendored so build rclone with go build -mod=vendor and git bisect should do the trick.

CosmicToast commented 4 years ago

I tried to find a bisect good, and went all the way back to 6396872d7 (the first release with go modules)... and it was still bad. I did try to use dep to build the version before that but it's just hopelessly broken.

I then toyed around with the various advanced configurations, and had the following revelations:

  1. I set the concurrency to "1". Once that was done, the "first pass" (45M quoted above) became exactly 3x chunk size (15M by default, once I set chunk size to 50M it got stuck on 150M).
  2. If I set upload_cutoff to "5G" (large.zip is around 2.5G), the issue mostly disappears (it takes a little bit to start up, even though I had checksums disabled at that point (for faster iteration), but then it goes at only 1/3 of the expected speed (~37MB/s, while ~100MB/s is expected based on other tools)).

From this it appears to be some edge case in rclone's chunking process.

ncw commented 4 years ago

Great, looks like we are getting somewhere.

Interestingly I just re-wrote the s3 chunked uploader in 7e6fac8b1eaeccbf48e197e2c1f12627c3e69ca5 - fancy trying the latest beta?

Can you run your test with -vv --dump headers and post the resulting log here? That will let us see which requests exactly are stalling.

CosmicToast commented 4 years ago

Latest beta (g9d993e58) - same behavior, jumps to 20M (guessing 1 chunk x 4 threads). Looking at --vv --dump headers output, it looks like minio is either not sending 100-Continue, or rclone is not interpreting them (see dump at the end). (the --bind argument is for the network dump)

I have a wireshark dump of the entire interaction (clean thanks to --bind), though I'm not sure how/what to interpret, given that I don't/can't see http protocol data (it's all TLSv1.3, for obvious reasons).

2020/01/05 14:58:49 DEBUG : Using config file from "/home/toast/.config/rclone/rclone.conf"
2020/01/05 14:58:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:58:49 DEBUG : HTTP REQUEST (req 0xc00011ff00)
2020/01/05 14:58:49 DEBUG : HEAD /tmp/large.zip HTTP/1.1
Host: minio.toast.cafe
User-Agent: rclone/v1.50.2-109-g9d993e58-beta
Authorization: XXXX
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20200105T195849Z

2020/01/05 14:58:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:58:49 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2020/01/05 14:58:49 DEBUG : HTTP RESPONSE (req 0xc00011ff00)
2020/01/05 14:58:49 DEBUG : HTTP/2.0 404 Not Found
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Date: Sun, 05 Jan 2020 19:58:01 GMT
Server: Caddy
Server: MinIO/DEVELOPMENT.2019-12-22T23-03-51Z
Vary: Origin
X-Amz-Request-Id: 15E715E831E8989D
X-Xss-Protection: 1; mode=block
Content-Length: 0

2020/01/05 14:58:49 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2020/01/05 14:58:49 DEBUG : large.zip: Need to transfer - File not found at Destination
2020/01/05 14:58:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:58:49 DEBUG : HTTP REQUEST (req 0xc00011f300)
2020/01/05 14:58:49 DEBUG : PUT /tmp HTTP/1.1
Host: minio.toast.cafe
User-Agent: rclone/v1.50.2-109-g9d993e58-beta
Content-Length: 0
Authorization: XXXX
X-Amz-Acl: private
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20200105T195849Z
Accept-Encoding: gzip

2020/01/05 14:58:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:58:49 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2020/01/05 14:58:49 DEBUG : HTTP RESPONSE (req 0xc00011f300)
2020/01/05 14:58:49 DEBUG : HTTP/2.0 409 Conflict
Content-Length: 336
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Content-Type: application/xml
Date: Sun, 05 Jan 2020 19:58:01 GMT
Server: Caddy
Server: MinIO/DEVELOPMENT.2019-12-22T23-03-51Z
Vary: Origin
X-Amz-Request-Id: 15E715E832392E2A
X-Xss-Protection: 1; mode=block

2020/01/05 14:58:49 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2020/01/05 14:58:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:58:49 DEBUG : HTTP REQUEST (req 0xc00011fb00)
2020/01/05 14:58:49 DEBUG : POST /tmp/large.zip?uploads= HTTP/1.1
Host: minio.toast.cafe
User-Agent: rclone/v1.50.2-109-g9d993e58-beta
Content-Length: 0
Authorization: XXXX
Content-Type: application/zip
X-Amz-Acl: private
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20200105T195849Z
X-Amz-Meta-Mtime: 1577756410.393861829
Accept-Encoding: gzip

2020/01/05 14:58:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:58:49 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2020/01/05 14:58:49 DEBUG : HTTP RESPONSE (req 0xc00011fb00)
2020/01/05 14:58:49 DEBUG : HTTP/2.0 200 OK
Content-Length: 247
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Content-Type: application/xml
Date: Sun, 05 Jan 2020 19:58:01 GMT
Server: Caddy
Server: MinIO/DEVELOPMENT.2019-12-22T23-03-51Z
Vary: Origin
X-Amz-Request-Id: 15E715E8326AF2AC
X-Xss-Protection: 1; mode=block

2020/01/05 14:58:49 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2020/01/05 14:58:49 DEBUG : large.zip: multipart upload starting chunk 1 size 5M offset 0/2.313G
2020/01/05 14:58:49 DEBUG : large.zip: multipart upload starting chunk 2 size 5M offset 5M/2.313G
2020/01/05 14:58:49 DEBUG : large.zip: multipart upload starting chunk 3 size 5M offset 10M/2.313G
2020/01/05 14:58:49 DEBUG : large.zip: multipart upload starting chunk 4 size 5M offset 15M/2.313G
2020/01/05 14:58:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:58:49 DEBUG : HTTP REQUEST (req 0xc0001ea100)
2020/01/05 14:58:49 DEBUG : PUT /tmp/large.zip?partNumber=2&uploadId=50c8fc80-4d19-455b-95c2-2e3fe5ec48ce HTTP/1.1
Host: minio.toast.cafe
User-Agent: rclone/v1.50.2-109-g9d993e58-beta
Content-Length: 5242880
Authorization: XXXX
Content-Md5: rm8YIpRZr1VAT0WjQdTuZg==
Expect: 100-Continue
X-Amz-Content-Sha256: d452ec046f518cab4e529424d96a56ac4ade72b14594d904c62a3ffe2969c64f
X-Amz-Date: 20200105T195849Z
Accept-Encoding: gzip

2020/01/05 14:58:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:58:49 DEBUG : HTTP REQUEST (req 0xc000660600)
2020/01/05 14:58:49 DEBUG : PUT /tmp/large.zip?partNumber=1&uploadId=50c8fc80-4d19-455b-95c2-2e3fe5ec48ce HTTP/1.1
Host: minio.toast.cafe
User-Agent: rclone/v1.50.2-109-g9d993e58-beta
Content-Length: 5242880
Authorization: XXXX
Content-Md5: +ylA9ARqtAvbJl1ioE7jNQ==
Expect: 100-Continue
X-Amz-Content-Sha256: b0acc3e09e3a597c1e1eb91cc8cb94445269e6ea5666e7ef360b475e86c7f0b5
X-Amz-Date: 20200105T195849Z
Accept-Encoding: gzip

2020/01/05 14:58:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:58:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:58:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:58:49 DEBUG : HTTP REQUEST (req 0xc0004f0100)
2020/01/05 14:58:49 DEBUG : PUT /tmp/large.zip?partNumber=3&uploadId=50c8fc80-4d19-455b-95c2-2e3fe5ec48ce HTTP/1.1
Host: minio.toast.cafe
User-Agent: rclone/v1.50.2-109-g9d993e58-beta
Content-Length: 5242880
Authorization: XXXX
Content-Md5: IF7JBZg43y81BuGW8Tjfdw==
Expect: 100-Continue
X-Amz-Content-Sha256: 2d69b308e7965c8e229aace7c5946efd031c15c27dfcd56602f7eae163c35148
X-Amz-Date: 20200105T195849Z
Accept-Encoding: gzip

2020/01/05 14:58:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:58:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:58:49 DEBUG : HTTP REQUEST (req 0xc0004e8700)
2020/01/05 14:58:49 DEBUG : PUT /tmp/large.zip?partNumber=4&uploadId=50c8fc80-4d19-455b-95c2-2e3fe5ec48ce HTTP/1.1
Host: minio.toast.cafe
User-Agent: rclone/v1.50.2-109-g9d993e58-beta
Content-Length: 5242880
Authorization: XXXX
Content-Md5: awvDTVgGYddnYcf6QpW8Vw==
Expect: 100-Continue
X-Amz-Content-Sha256: 01e17b4d847163794a07831273c8be9d15852ce5f358767142faf2a8d0f33545
X-Amz-Date: 20200105T195849Z
Accept-Encoding: gzip

2020/01/05 14:58:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:59:49 INFO  :
Transferred:           20M / 2.313 GBytes, 1%, 342.082 kBytes/s, ETA 1h57m11s
Transferred:            0 / 1, 0%
Elapsed time:       59.8s
Transferring:
 *                                     large.zip:  0% /2.313G, 79.784k/s, 8h22m26s

2020/01/05 14:59:49 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2020/01/05 14:59:49 DEBUG : HTTP RESPONSE (req 0xc0001ea100)
2020/01/05 14:59:49 DEBUG : HTTP/2.0 200 OK
Content-Length: 0
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Date: Sun, 05 Jan 2020 19:59:01 GMT
Etag: "ae6f18229459af55404f45a341d4ee66"
Server: Caddy
Server: MinIO/DEVELOPMENT.2019-12-22T23-03-51Z
Vary: Origin
X-Amz-Request-Id: 15E715E83C8EDDDB
X-Xss-Protection: 1; mode=block

2020/01/05 14:59:49 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2020/01/05 14:59:49 DEBUG : large.zip: multipart upload starting chunk 5 size 5M offset 20M/2.313G
2020/01/05 14:59:49 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2020/01/05 14:59:49 DEBUG : HTTP RESPONSE (req 0xc0004f0100)
2020/01/05 14:59:49 DEBUG : HTTP/2.0 200 OK
Content-Length: 0
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Date: Sun, 05 Jan 2020 19:59:01 GMT
Etag: "205ec9059838df2f3506e196f138df77"
Server: Caddy
Server: MinIO/DEVELOPMENT.2019-12-22T23-03-51Z
Vary: Origin
X-Amz-Request-Id: 15E715E83CB6F5C8
X-Xss-Protection: 1; mode=block

2020/01/05 14:59:49 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2020/01/05 14:59:49 DEBUG : large.zip: multipart upload starting chunk 6 size 5M offset 25M/2.313G
2020/01/05 14:59:49 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2020/01/05 14:59:49 DEBUG : HTTP RESPONSE (req 0xc000660600)
2020/01/05 14:59:49 DEBUG : HTTP/2.0 200 OK
Content-Length: 0
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Date: Sun, 05 Jan 2020 19:59:01 GMT
Etag: "fb2940f4046ab40bdb265d62a04ee335"
Server: Caddy
Server: MinIO/DEVELOPMENT.2019-12-22T23-03-51Z
Vary: Origin
X-Amz-Request-Id: 15E715E83C64F16B
X-Xss-Protection: 1; mode=block

2020/01/05 14:59:49 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2020/01/05 14:59:49 DEBUG : large.zip: multipart upload starting chunk 7 size 5M offset 30M/2.313G
2020/01/05 14:59:49 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2020/01/05 14:59:49 DEBUG : HTTP RESPONSE (req 0xc0004e8700)
2020/01/05 14:59:49 DEBUG : HTTP/2.0 200 OK
Content-Length: 0
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Date: Sun, 05 Jan 2020 19:59:01 GMT
Etag: "6b0bc34d580661d76761c7fa4295bc57"
Server: Caddy
Server: MinIO/DEVELOPMENT.2019-12-22T23-03-51Z
Vary: Origin
X-Amz-Request-Id: 15E715E83D401D0B
X-Xss-Protection: 1; mode=block

2020/01/05 14:59:49 DEBUG : <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2020/01/05 14:59:49 DEBUG : large.zip: multipart upload starting chunk 8 size 5M offset 35M/2.313G
2020/01/05 14:59:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:59:49 DEBUG : HTTP REQUEST (req 0xc0004f0300)
2020/01/05 14:59:49 DEBUG : PUT /tmp/large.zip?partNumber=5&uploadId=50c8fc80-4d19-455b-95c2-2e3fe5ec48ce HTTP/1.1
Host: minio.toast.cafe
User-Agent: rclone/v1.50.2-109-g9d993e58-beta
Content-Length: 5242880
Authorization: XXXX
Content-Md5: CkF+uhvlWhTHjbtVUsoFdQ==
Expect: 100-Continue
X-Amz-Content-Sha256: 5f28464144e06bc41f9b67d818f38a0edc621fb495ca7d11a69bfc40a2d8029a
X-Amz-Date: 20200105T195949Z
Accept-Encoding: gzip

2020/01/05 14:59:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:59:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:59:49 DEBUG : HTTP REQUEST (req 0xc0001ea300)
2020/01/05 14:59:49 DEBUG : PUT /tmp/large.zip?partNumber=6&uploadId=50c8fc80-4d19-455b-95c2-2e3fe5ec48ce HTTP/1.1
Host: minio.toast.cafe
User-Agent: rclone/v1.50.2-109-g9d993e58-beta
Content-Length: 5242880
Authorization: XXXX
Content-Md5: RsyKSYPTSkEONBr0BAQORA==
Expect: 100-Continue
X-Amz-Content-Sha256: c9a1336f232d50723307e9f9672ebc08467a5c59ed30edbcf10266024bc016b0
X-Amz-Date: 20200105T195949Z
Accept-Encoding: gzip

2020/01/05 14:59:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:59:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:59:49 DEBUG : HTTP REQUEST (req 0xc0004e8100)
2020/01/05 14:59:49 DEBUG : PUT /tmp/large.zip?partNumber=7&uploadId=50c8fc80-4d19-455b-95c2-2e3fe5ec48ce HTTP/1.1
Host: minio.toast.cafe
User-Agent: rclone/v1.50.2-109-g9d993e58-beta
Content-Length: 5242880
Authorization: XXXX
Content-Md5: iwq+8Ml/Q+fcWbYvwHWY/A==
Expect: 100-Continue
X-Amz-Content-Sha256: fffbbfc2ec010827f28a87e303bb4d65993eeac24bd78a10de8b42c6bf11d19c
X-Amz-Date: 20200105T195949Z
Accept-Encoding: gzip

2020/01/05 14:59:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:59:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2020/01/05 14:59:49 DEBUG : HTTP REQUEST (req 0xc00011f300)
2020/01/05 14:59:49 DEBUG : PUT /tmp/large.zip?partNumber=8&uploadId=50c8fc80-4d19-455b-95c2-2e3fe5ec48ce HTTP/1.1
Host: minio.toast.cafe
User-Agent: rclone/v1.50.2-109-g9d993e58-beta
Content-Length: 5242880
Authorization: XXXX
Content-Md5: TLt9/2Q6j0DCIKZwUefpPQ==
Expect: 100-Continue
X-Amz-Content-Sha256: e686b06972899a792d0fe1f41081daee2cc269b30404007ff26aa5fbf49d9324
X-Amz-Date: 20200105T195949Z
Accept-Encoding: gzip

2020/01/05 14:59:49 DEBUG : >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
ncw commented 4 years ago

There definitely appears to be a 1 minute timeout doesn't there from request to response.

I guess that might be the delay if the server doesn't support expect/continue properly. Rclone's expect continue timeout is set to 1 second I think.

Do you think the proxy doesn't support expect continue properly?

CosmicToast commented 4 years ago

I believe expecting 100-Continue breaks s3 specification. First, find attached (at the bottom) the output of minio-client with the debug flag (similar to the above) uploading at a fast rate (cancelled at an arbitrary point). Note the lack of Expect: and 100 responses.

Now, why are those missing? I took a look at the s3 specification. The multipart upload docs have a dedicated REST API section, which documents the upload part procedure.

Here is the summary: You send something like this:

PUT /my-movie.m2ts?partNumber=1&uploadId=VCVsb2FkIElEIGZvciBlbZZpbmcncyBteS1tb3ZpZS5tMnRzIHVwbG9hZR HTTP/1.1
Host: example-bucket.s3.amazonaws.com
Date:  Mon, 1 Nov 2010 20:34:56 GMT
Content-Length: 10485760
Content-MD5: pUNXr/BjKK5G2UKvaRRrOA==
Authorization: authorization string
>>>>>>> reefconvert-joe

***part data omitted***

And you receive something like this when it ends.


HTTP/1.1 200 OK
x-amz-id-2: Vvag1LuByRx9e6j5Onimru9pO4ZVKnJ2Qz7/C1NPcfTWAtRPfTaOFg==
x-amz-request-id: 656c76696e6727732072657175657374
Date:  Mon, 1 Nov 2010 20:34:56 GMT
ETag: "b54357faf0632cce46e942fa68356b38"
Content-Length: 0
Connection: keep-alive
Server: AmazonS3

Under this assumption, the current rclone performance should also be terrible against real aws, because it will expect 100 (and never receive it, as per the specification). I don't have any aws storage instance to test against though. However, based on all of this, my understanding is that 100-Continue should not be used or present in multipart uploads. Am I missing something?

(mcli output, let it go to ~500mb completion, only took a few seconds)

mcli: <DEBUG> GET /tmp/?location= HTTP/1.1
Host: minio.toast.cafe
User-Agent: MinIO (linux; amd64) minio-go/v6.0.45 mcli/2019-12-17T23:26:28Z
Authorization: AWS4-HMAC-SHA256 Credential=**REDACTED**/20200107/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: UNSIGNED-PAYLOAD
X-Amz-Date: 20200107T014440Z
Accept-Encoding: gzip

mcli: <DEBUG> HTTP/1.1 200 OK
Content-Length: 128
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Content-Type: application/xml
Date: Tue, 07 Jan 2020 01:43:50 GMT
Server: Caddy
Server: MinIO/DEVELOPMENT.2019-12-22T23-03-51Z
Vary: Origin
X-Amz-Request-Id: 15E7775BDBF71648
X-Xss-Protection: 1; mode=block

mcli: <DEBUG> Response Time:  20.573756ms

mcli: <DEBUG> HEAD /tmp/ HTTP/1.1
Host: minio.toast.cafe
User-Agent: MinIO (linux; amd64) minio-go/v6.0.45 mcli/2019-12-17T23:26:28Z
Authorization: AWS4-HMAC-SHA256 Credential=**REDACTED**/20200107/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20200107T014440Z

mcli: <DEBUG> HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Date: Tue, 07 Jan 2020 01:43:50 GMT
Server: Caddy
Server: MinIO/DEVELOPMENT.2019-12-22T23-03-51Z
Vary: Origin
X-Amz-Request-Id: 15E7775BDC2CFF43
X-Xss-Protection: 1; mode=block
Content-Length: 0

mcli: <DEBUG> Response Time:  3.059448ms

mcli: <DEBUG> HEAD /tmp/ HTTP/1.1
Host: minio.toast.cafe
User-Agent: MinIO (linux; amd64) minio-go/v6.0.45 mcli/2019-12-17T23:26:28Z
Authorization: AWS4-HMAC-SHA256 Credential=**REDACTED**/20200107/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20200107T014440Z

mcli: <DEBUG> HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Date: Tue, 07 Jan 2020 01:43:50 GMT
Server: Caddy
Server: MinIO/DEVELOPMENT.2019-12-22T23-03-51Z
Vary: Origin
X-Amz-Request-Id: 15E7775BDC59FC69
X-Xss-Protection: 1; mode=block
Content-Length: 0

mcli: <DEBUG> Response Time:  2.470063ms

mcli: <DEBUG> HEAD /tmp/ HTTP/1.1
Host: minio.toast.cafe
User-Agent: MinIO (linux; amd64) minio-go/v6.0.45 mcli/2019-12-17T23:26:28Z
Authorization: AWS4-HMAC-SHA256 Credential=**REDACTED**/20200107/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20200107T014440Z

mcli: <DEBUG> HTTP/1.1 200 OK
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Date: Tue, 07 Jan 2020 01:43:50 GMT
Server: Caddy
Server: MinIO/DEVELOPMENT.2019-12-22T23-03-51Z
Vary: Origin
X-Amz-Request-Id: 15E7775BDC887F9C
X-Xss-Protection: 1; mode=block
Content-Length: 0

mcli: <DEBUG> Response Time:  2.166963ms

mcli: <DEBUG> POST /tmp/large.zip?uploads= HTTP/1.1
Host: minio.toast.cafe
User-Agent: MinIO (linux; amd64) minio-go/v6.0.45 mcli/2019-12-17T23:26:28Z
Content-Length: 0
Authorization: AWS4-HMAC-SHA256 Credential=**REDACTED**/20200107/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
Content-Type: application/zip
X-Amz-Content-Sha256: UNSIGNED-PAYLOAD
X-Amz-Date: 20200107T014440Z
Accept-Encoding: gzip

mcli: <DEBUG> HTTP/1.1 200 OK
Content-Length: 247
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Content-Type: application/xml
Date: Tue, 07 Jan 2020 01:43:50 GMT
Server: Caddy
Server: MinIO/DEVELOPMENT.2019-12-22T23-03-51Z
Vary: Origin
X-Amz-Request-Id: 15E7775BDCB367F9
X-Xss-Protection: 1; mode=block

mcli: <DEBUG> Response Time:  2.7775ms

mcli: <DEBUG> PUT /tmp/large.zip?partNumber=2&uploadId=c87445c7-22d5-4172-9d12-4c24170eb179 HTTP/1.1
Host: minio.toast.cafe
User-Agent: MinIO (linux; amd64) minio-go/v6.0.45 mcli/2019-12-17T23:26:28Z
Content-Length: 134217728
Authorization: AWS4-HMAC-SHA256 Credential=**REDACTED**/20200107/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: UNSIGNED-PAYLOAD
X-Amz-Date: 20200107T014440Z
Accept-Encoding: gzip

mcli: <DEBUG> HTTP/1.1 200 OK
Content-Length: 0
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Date: Tue, 07 Jan 2020 01:43:54 GMT
Etag: "8873dc889c6147398355cca9c18cc681-1"
Server: Caddy
Server: MinIO/DEVELOPMENT.2019-12-22T23-03-51Z
Vary: Origin
X-Amz-Request-Id: 15E7775BDCF1F8F0
X-Xss-Protection: 1; mode=block

mcli: <DEBUG> Response Time:  4.100123093s

mcli: <DEBUG> PUT /tmp/large.zip?partNumber=1&uploadId=c87445c7-22d5-4172-9d12-4c24170eb179 HTTP/1.1
Host: minio.toast.cafe
User-Agent: MinIO (linux; amd64) minio-go/v6.0.45 mcli/2019-12-17T23:26:28Z
Content-Length: 134217728
Authorization: AWS4-HMAC-SHA256 Credential=**REDACTED**/20200107/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: UNSIGNED-PAYLOAD
X-Amz-Date: 20200107T014440Z
Accept-Encoding: gzip

mcli: <DEBUG> HTTP/1.1 200 OK
Content-Length: 0
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Date: Tue, 07 Jan 2020 01:43:55 GMT
Etag: "56ef370a9ee959e7cd9e1541ca2aa14f-1"
Server: Caddy
Server: MinIO/DEVELOPMENT.2019-12-22T23-03-51Z
Vary: Origin
X-Amz-Request-Id: 15E7775BDDFB3685
X-Xss-Protection: 1; mode=block

mcli: <DEBUG> Response Time:  4.572582037s
CosmicToast commented 4 years ago

As a further note, performance is fine when going against the minio server directly (via http://fafnir). There are still no HTTP 100s being returned, though, so I guess that can't be it. It's a bit too long to paste here, so here is a paste of what it looks like against a direct remote.

So to summarize, to the best of my understanding, it is not the lack of 100-Continue, because:

  1. Performance is fine against http:// without it occurring.
  2. Performance is fine using mcli (through the https proxy) without it occurring.
  3. The specification does not explicitly allow for it.
ncw commented 4 years ago

However, based on all of this, my understanding is that 100-Continue should not be used or present in multipart uploads. Am I missing something?

Rclone uses the AWS SDK for its s3 interactions and it is the AWS SDK that sets expect/continue up. So I'd be really surprised if that isn't the proper way of interacting with s3. Note I integration test rclone with aws/digital ocean/wasabi/ceph/alibaba all of which have their own s3 implementations, all of which are happy with expect/continue.

So to summarize, to the best of my understanding, it is not the lack of 100-Continue,

I think that 60 second timeout is the thing to dig into. Are any timeouts in the proxy 60s?

Is the proxy doing reverse DNS on a non-existent name and timing out after 60s?

ncw commented 4 years ago

I had a thought, there is one timer that is 60s in rclone

      --contimeout duration                  Connect timeout (default 1m0s)

Can you try setting that to a different amount and see if it changes things?

If that is the problem then it means that the proxy probably needs to have more simultaneous connections configured.

CosmicToast commented 4 years ago

Rclone uses the AWS SDK for its s3 interactions and it is the AWS SDK that sets expect/continue up.

That's kind of strange, I didn't have any issues using aws-cli.

Can you try setting that to a different amount and see if it changes things?

It changes how much time is taken between uploads of 5M at a time, but overall performance is still ~10x lower than it should be (expected, since the timeouts still happen).

If that is the problem then it means that the proxy probably needs to have more simultaneous connections configured.

The proxy has no connection limit. Caddy does not know how to limit incoming external connections (if it did, it wouldn't be a very useful web/rproxy server, would it?), and defaults (used in this case) to no limits to the connections to the backend when acting as an rproxy. Further, as mentioned, there are no issues when going through the rproxy with aws-cli, s3cmd or minio-client.

So the problem then appears to be rclone thinking it's timing out in the connect phase, then?

ncw commented 4 years ago

So the problem then appears to be rclone thinking it's timing out in the connect phase, then?

It looks like it yet.

I suspect if you used another tool based on the AWS go CLI (which aws-cli isn't - it is python) you'd see the same result.

I just noticed I set the expect continue timeout to the same 60s by default which is defintely a mistake...

Try this which sets the expect continue to 1s

https://beta.rclone.org/branch/v1.50.2-124-gbd5cae21-fix-3835-expect-continue-beta/ (uploaded in 15-30 mins)

If that works properly then it means that the proxy isn't supporting expect/continue properly

CosmicToast commented 4 years ago

If that works properly then it means that the proxy isn't supporting expect/continue properly

This works the same way as setting the connection timeout to 1s - better, but still 10+x slower than it should be. However, when I ran rclone against the server directly via http:// I didn't see any 100 responses either.

I suspect if you used another tool based on the AWS go CLI (which aws-cli isn't - it is python) you'd see the same result.

Are you aware of any other consumers I could check with? It's starting to look like it isn't equivalent to the other SDKs (to avoid saying it's not strictly compliant to the spec).

ncw commented 4 years ago

I added an --expect-continue-timeout flag. I suggest you try with 0 which disables expect/continue.

If --expect-continue-timeout 0 works OK then it sounds to me like we've found the issue - the proxy doesn't support expect/continue properly.

https://beta.rclone.org/branch/v1.50.2-124-g081f88e4-fix-3835-expect-continue-beta/ (uploaded in 15-30 mins)

CosmicToast commented 4 years ago

--expect-continue-timeout=0 works ok; it's still about half the performance I expect (~45MB/s instead of >100) but it's much better than previously.

Based on this:

  1. Is it possible to configure expect-continue-timeout in a config file? It'd be nice if I didn't have to explicitly specify it in the command line.
  2. Let's put a pin on this until caddy2 is released (they're in the beta stage right now), at which point I'll try to write a minimal case to reproduce this.

re: part 2 of 2. - do you have any idea of a good way to demonstrate the faulty behavior in a maximally minimal way?

ncw commented 4 years ago

I've merged this to master now which means it will be in the latest beta in 15-30 mins and released in v1.51

Is it possible to configure expect-continue-timeout in a config file? It'd be nice if I didn't have to explicitly specify it in the command line.

Not currently... You can set an environment variable though RCLONE_EXPECT_CONTINUE_TIMEOUT though.

t's put a pin on this until caddy2 is released (they're in the beta stage right now), at which point I'll try to write a minimal case to reproduce this.

OK - when caddy2 comes out then please retest and re-open if necessary!

re: part 2 of 2. - do you have any idea of a good way to demonstrate the faulty behavior in a maximally minimal way?

Probably varying --expect-continue-timeout and measuring the performance is a pretty good way.