minio / mc

Unix like utilities for object store
https://min.io/download
GNU Affero General Public License v3.0

mc admin update is extremely slow since version RELEASE.2024-01-28T16-23-14Z (server is already running latest version) #4980

Closed spranta-devops closed 4 weeks ago

spranta-devops commented 3 months ago

Expected behavior

When the MinIO server is already running the latest version, calling mc admin update <ALIAS> should return within a second (or a few seconds).

Here is the (sanitized) debug output with the last fast working mc version (from mcli_20240118070339.0.0_amd64.deb):

time mc admin update minio01 --debug
You are about to upgrade *MinIO Server*, please confirm [y/N]: y
mc: <DEBUG> POST /minio/admin/v3/update?updateURL= HTTP/1.1
Host: minio01.sanitized.com
User-Agent: MinIO (linux; amd64) madmin-go/2.0.0 mc/RELEASE.2024-01-18T07-03-39Z
Transfer-Encoding: chunked
Accept-Encoding: zstd,gzip
Authorization: sanitized
X-Amz-Content-Sha256: sanitized
X-Amz-Date: 20240708T074246Z

mc: <DEBUG> HTTP/1.1 200 OK
Content-Length: 81
Accept-Ranges: bytes
Content-Type: application/json
Date: Mon, 08 Jul 2024 07:42:46 GMT
Server: MinIO
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Origin
Vary: Accept-Encoding
X-Amz-Bucket-Region: sanitized
X-Amz-Id-2: sanitized
X-Amz-Request-Id: sanitized
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block

Server `minio01` already running the most recent version 2024-07-04T14:25:45Z of MinIO

real    0m0.811s
user    0m0.055s
sys     0m0.026s

Actual behavior

It takes about 2m30s to return. To prevent a gateway timeout error from the load balancer (HAProxy), I had to either increase the server timeout or use a MinIO node directly as the alias, bypassing the load balancer.

The (sanitized) debug output:

time mc admin update minio01 --debug
You are about to upgrade *MinIO Server*, please confirm [y/N]: y
mc: <DEBUG> POST /minio/admin/v3/update?dry-run=false&type=2&updateURL= HTTP/1.1
Host: minio01.sanitized.com
User-Agent: MinIO (linux; amd64) madmin-go/2.0.0 mc/RELEASE.2024-01-28T16-23-14Z
Transfer-Encoding: chunked
Accept-Encoding: zstd,gzip
Authorization: sanitized
X-Amz-Content-Sha256: sanitized
X-Amz-Date: 20240709T050957Z

mc: <DEBUG> HTTP/1.1 200 OK
Content-Length: 963
Accept-Ranges: bytes
Content-Type: application/json
Date: Tue, 09 Jul 2024 05:12:26 GMT
Server: MinIO
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Origin
Vary: Accept-Encoding
X-Amz-Bucket-Region: sanitized
X-Amz-Id-2: sanitized
X-Amz-Request-Id: sanitized
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block

Server update request sent successfully `minio01`
┌──────────────────────────────────────────┬────────────────────────────────────────────────────────────────────┐
│ HOST                                     │ STATUS                                                             │
├──────────────────────────────────────────┼────────────────────────────────────────────────────────────────────┤
│ minio01-node02.sanitized.com:9000 │ server is already running the latest version: 2024-07-04T14:25:45Z │
│ minio01-node01.sanitized.com:9000 │ server is already running the latest version: 2024-07-04T14:25:45Z │
│ minio01-node03.sanitized.com:9000 │ server is already running the latest version: 2024-07-04T14:25:45Z │
│ minio01-node04.sanitized.com:9000 │ server is already running the latest version: 2024-07-04T14:25:45Z │
│ minio01-node05.sanitized.com:9000 │ server is already running the latest version: 2024-07-04T14:25:45Z │
└──────────────────────────────────────────┴────────────────────────────────────────────────────────────────────┘

real    2m31.105s
user    0m0.075s
sys     0m0.038s

Steps to reproduce the behavior

Run mc admin update <ALIAS> against a server that is already running the latest version (2024-07-04T14:25:45Z at the moment).

For testing, set the ALIAS to a node rather than the load balancer URL; this is easier to set up.
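A minimal sketch of that reproduction setup, assuming bash. The alias name, node URL, and credentials are placeholders, and the `run` helper is a hypothetical wrapper added here so the commands can be previewed with `DRY_RUN=1` before touching a real cluster.

```shell
#!/usr/bin/env bash
# Sketch only: alias, URL, and credentials below are placeholders.

# Print the command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Register an alias that points at a single node, not the load balancer.
# Usage: add_node_alias ALIAS URL ACCESS_KEY SECRET_KEY
add_node_alias() {
    run mc alias set "$1" "$2" "$3" "$4"
}

# Run the update with the same --debug trace shown in this report.
# Usage: update_debug ALIAS
update_debug() {
    run mc admin update "$1" --debug
}

# Example (placeholders), prefixing the call with bash's `time` keyword:
#   add_node_alias minio01-node01 https://minio01-node01.sanitized.com:9000 "$ACCESS_KEY" "$SECRET_KEY"
#   time update_debug minio01-node01
```

Comparing the `real` value reported by `time` for an older and a newer mc release against the same node should show the difference described above.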

mc --version

mc version RELEASE.2024-07-08T20-59-24Z (commit-id=21d3ec0089a1fa297cbdc23db413012535e2ff9e) Runtime: go1.22.5 linux/amd64 Copyright (c) 2015-2024 MinIO, Inc. License GNU AGPLv3 https://www.gnu.org/licenses/agpl-3.0.html

System information

Running mc on an Ubuntu VM

dhananjaykrutika commented 4 weeks ago

This seems to be related to the introduction of a newer implementation of ServerUpdate back in January 2024, namely ServerUpdateV2Handler. Even on a single-node setup, it takes around 5min45s for mc admin update to conclude that the server is running the latest version. After adding a few extra logs and rerunning the same tests, I see that the downloadBinary() step alone takes around 5min40s, as the binary is about 100MB in size.

@harshavardhana, can this update not be done in two steps, where:

That way, if all servers are found to be running the latest version, originator server needn't download the binary and send it over the network to all peers to do the update.

harshavardhana commented 4 weeks ago

mc admin update is never meant to be faster, @dhananjaykrutika. The issue is that downloading from dl.min.io is very slow, since it is throttled by our download server.

The ideal scenario is that customers set up a local mirror and then run mc admin update http://local-mirror/minio.sha256sum

Or just download the binary, copy it to all the nodes, and then run mc admin service restart alias/
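Both alternatives can be sketched as small bash functions. This is only an illustration under assumptions: the alias, the node names, the destination path, and the use of `scp` for copying are hypothetical, and `run` is a helper introduced here so the commands can be previewed with `DRY_RUN=1`.

```shell
#!/usr/bin/env bash
# Sketch only: alias, hosts, and paths are placeholders.

# Print the command instead of executing it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Option 1: update from a local mirror instead of dl.min.io.
# Usage: update_from_mirror ALIAS MIRROR_SHA256SUM_URL
update_from_mirror() {
    run mc admin update "$1" "$2"
}

# Option 2a: copy an already downloaded binary to one node
# (scp is an assumption; any distribution mechanism works).
# Usage: copy_binary LOCAL_BINARY HOST REMOTE_PATH
copy_binary() {
    run scp "$1" "$2:$3"
}

# Option 2b: restart the deployment once every node has the new binary.
# Usage: restart_all ALIAS
restart_all() {
    run mc admin service restart "$1"
}

# Example (placeholders):
#   update_from_mirror minio01 http://local-mirror/minio.sha256sum
#   for n in node01 node02; do copy_binary ./minio "$n" /usr/local/bin/minio; done
#   restart_all minio01
```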

> That way, if all servers are found to be running the latest version, originator server needn't download the binary and send it over the network to all peers to do the update.

Well, that is what ServerUpdateV1 was. The issue is that downloading from 10+ servers is much slower than downloading from 1 server, if you factor in the throttling.

For example, we have deployments with 500+ nodes; they would never finish updating. However, the issue here is not the download of the binary itself; it is that we are downloading the binary even when the server is already on the latest version, which we should never do.

The bug must be fixed on the server side.
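The intended behavior, skipping the download when every node already reports the latest version, can be illustrated with a short bash sketch. This is not the server's code (the real fix belongs in the MinIO server); the function names are hypothetical, and the sketch assumes version strings in the timestamp form shown in the output above, which sort chronologically as plain text.

```shell
#!/usr/bin/env bash
# Sketch only: illustrates the "check before download" logic, not MinIO's implementation.

# Succeeds when version $1 is strictly older than version $2.
# Relies on ISO-like timestamps (e.g. 2024-07-04T14:25:45Z) sorting lexicographically.
is_older() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | LC_ALL=C sort | head -n 1)" = "$1" ]
}

# Succeeds when at least one reported node version is older than LATEST.
# Usage: any_node_outdated LATEST NODE_VERSION...
any_node_outdated() {
    local latest="$1" v
    shift
    for v in "$@"; do
        if is_older "$v" "$latest"; then
            return 0
        fi
    done
    return 1
}

# Example: only download the binary when some node actually needs it.
#   if any_node_outdated "$latest" "${node_versions[@]}"; then
#       echo "download and distribute the binary"
#   else
#       echo "all nodes are up to date, nothing to download"
#   fi
```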