minio / mc

Simple | Fast tool to manage MinIO clusters :cloud:
https://min.io/download
GNU Affero General Public License v3.0

Too much concurrency leads to server failure with status 408 #3864

Closed · wanghanyuxi closed this issue 2 years ago

wanghanyuxi commented 2 years ago

Hello, I am trying to use 'mc cp' to copy files from local to the server. Local bandwidth is 2 MB/s, but mc's concurrency can reach 40+. The slow individual connections lead to server failures with status 408. It does not get better even after limiting the multipart-put concurrency to 1 with 'export MC_UPLOAD_MULTIPART_THREADS=1'.
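For context, the invocation is essentially the following. This is only a sketch: the alias 'myminio', the bucket, and the local path are placeholders, not the real values.

```sh
# Sketch only: "myminio", the bucket, and the local path are placeholders.
export MC_UPLOAD_MULTIPART_THREADS=1   # limit multipart upload concurrency to 1

mc cp ./big-test-data-1 myminio/test-bucket/data/
```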

Expected behavior

The copy succeeds, with the server returning status 200.

Actual behavior

Some requests return 408.

Steps to reproduce the behavior

1. Local bandwidth is slow and unstable.
2. Use 'mc cp' to copy local files to the server.

mc --version

mc version RELEASE.2021-11-05T10-05-06Z

System information

wanghanyuxi commented 2 years ago

Maybe it would be helpful to allow users to change 'maxParallelWorkers' via an environment variable. The mechanism of dynamically increasing workers may not be suitable for low-speed, unstable network environments.

harshavardhana commented 2 years ago

@wanghanyuxi mc retries such errors. Can you provide --debug information?

Maybe it would be helpful to allow users to change 'maxParallelWorkers' via an environment variable.

You can set GOMAXPROCS=8
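A minimal sketch of that suggestion; GOMAXPROCS caps the number of logical CPUs the Go runtime uses for the mc process. The alias and paths below are placeholders, not values from this issue.

```sh
# Sketch: cap the Go runtime at 8 logical CPUs for this mc run.
# "myminio" and the paths are placeholders.
export GOMAXPROCS=8

mc --debug cp ./big-test-data-1 myminio/test-bucket/data/
```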

wanghanyuxi commented 2 years ago

mc: PUT /xxx-beijing-test-12/data/big-xxx-test-data-1?partNumber=2&uploadId=aa5e17a72f1f4a7aa57421f7bdf4f9e4 HTTP/1.1
Host: xxx.com
User-Agent: MinIO (windows; amd64) minio-go/v7.0.15 mc/DEVELOPMENT.GOGET
Content-Length: 16777216
Authorization: AWS REDACTED:REDACTED
Date: Sun, 14 Nov 2021 03:57:13 GMT
Accept-Encoding: gzip

mc: HTTP/1.1 408 Request Timeout
Transfer-Encoding: chunked
Connection: keep-alive
Content-Type: application/xml
Date: Sun, 14 Nov 2021 04:01:42 GMT
Server: Tengine
X-Application-Context: application
X-Kss-Request-Id: 9867e53690b94f7aaf42adcc293f594f

14e
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Error>
  <Code>RequestTimeout</Code>
  <Message>Client Request Timeout</Message>
  <Resource>/xxx-beijing-test-12/data/big-xxx-test-data-1?uploadId=aa5e17a72f1f4a7aa57421f7bdf4f9e4&partNumber=2</Resource>
  <RequestId>9867e53690b94f7aaf42adcc293f594f</RequestId>
</Error>
0

mc: Response Time: 1m25.995259s

stackTrace:
(3) cp-main.go:579 cmd.doCopySession(..) Tags: [/data/logs/test5]
(2) common-methods.go:592 cmd.uploadSourceToTargetURL(..) Tags: [/data/logs/test5]
(1) common-methods.go:329 cmd.putTargetStream(..) Tags: [ks3, http://xxx.com/test/test5]
(0) client-s3.go:1062 cmd.(*S3Client).Put(..)
Release-Tag:RELEASE.2021-11-05T10-05-06Z | Commit:9f6b50014291 | Host:ubuntu | OS:linux | Arch:amd64 | Lang:go1.17.2 | Mem:3.0 GB/3.3 GB | Heap:3.0 GB/3.2 GB

mc: Command terminated safely. Run this command to resume copy again.

wanghanyuxi commented 2 years ago

Hi,

Too many 408s lead to failure of 'mc cp'. With 'export MC_UPLOAD_MULTIPART_THREADS=1' and 'export GOMAXPROCS=1', the number of workers in mc can be kept within 10 on my machine, which has 8 CPU cores. But with limited bandwidth (2~3 MB/s), a concurrency of 10 may still lead to 408s. So supporting a lower number of startup threads (currently 8, the number of CPU cores) would be better for this case. This case will happen when using 'mc cp' to copy a large amount of data over limited bandwidth.
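For reference, the configuration described above amounts to roughly this; it is a sketch, with a placeholder alias and placeholder paths rather than the real ones.

```sh
# Sketch of the low-concurrency configuration described above;
# "myminio" and the paths are placeholders.
export MC_UPLOAD_MULTIPART_THREADS=1   # one thread for multipart puts
export GOMAXPROCS=1                    # one logical CPU for the Go runtime

mc cp --recursive ./data/ myminio/test-bucket/data/
```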

Thank you very much.

harshavardhana commented 2 years ago

This case will happen when using 'mc cp' to copy a large amount of data over limited bandwidth.

This is not an issue of that; it's your server that is not handling the load, and it seems to have really low timeouts for such calls.

mc: Response Time: 1m25.995259s

That is an exact 86-second timeout (1m25.995s ≈ 86 s).