minio / mc

Simple | Fast tool to manage MinIO clusters :cloud:
https://min.io/download
GNU Affero General Public License v3.0
2.83k stars 541 forks source link

Memory leak when mirroring older minio servers. #3990

Closed chrisbecke closed 2 years ago

chrisbecke commented 2 years ago

Expected behavior

The container version of minio/mc should use the available S3 APIs and mirror minio servers no matter their age.

Actual behavior

The containerized version of mc eventially consumes all memory and is OOM killed by docker.

Steps to reproduce the behavior

I am running an older minio server as a source server. It was installed circa 2020 and for regulatory reasons has not been updated. It is a simple minio in front of a single disk setup. The target minio instance is a 4 node cluster that is up-to-date.

I run mc as a docker container on docker swarm using this stack file:

version: "3.9"

configs:
  mc-2:
    file: .mc/config.json

networks:
  net:

services:
  sync-zfp-iob:
    image: minio/mc:latest
    command: >
      mirror
      --watch
      --monitoring-address=0.0.0.0:8080
      --quiet
      zfp/downloads
      iob/downloads
    networks:
      net:
    configs:
      - source: mc-2
        target: /root/.mc/config.json
    deploy:
      resources:
        limits:
          cpus: "0.2"
          memory: 1G
      restart_policy:
        max_attempts: 1
  1. Prepar a .mc/config.json which contains an iob config pointing to a new minio instance and a zfp config pointing to an older - circa 2020 - minio instance. Have a "downloads" bucket on the old instance you want to mirror to the new instance.
  2. docker stack deploy --compose-file sync-stack.yml sync
  3. wait 30 minutes
  4. docker service ps sync_sync-zfp-iob
  5. It should be oom killed by docker by now.

mc --version

System information

Additional Info

Without the addition of --watch the command performs the mirror operation pretty flawlesslessly, but we need to mirror new content as it arrives. At any rate, if this operation is not supported, mc should fail outright, or log errors to the prometheus endpoint, NOT leak memory and die.

harshavardhana commented 2 years ago

mc uses GOMAXPROCS to run in parallel so it can potentially use memory you need two things to set - change GOMAXPROCS to make sure it uses less concurrency.

With --watch the issue is that the memory that is used to upload is never GC'ed fast enough. So you hit the upper GC overhead as well.

So you need to reduce the concurrency such that memory is relinquished cleanly with GOGC=25 and then set GOMAXPROCS=4

However

chrisbecke commented 2 years ago

This seems to have something to do with minio 2020-04-04T05:39:31Z not supporting something that mc mirror --watch is expecting - but I thought that mc mirror just needed/used standard s3 APIs, so even older builds of minio should be ok.

For reference, the minio sync service logs look like this - (ill grab a representative 60 seconds of doing nothing (no actual file activity happend in this window)) :-

2022-02-18T08:55:00.141668997Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:00.141730758Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:01.767139268Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:01.767250727Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:04.042691889Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:04.042829964Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:05.941626276Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:05.941699457Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:06.947228356Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:06.947229275Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:09.082395110Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:09.082423831Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:09.767331111Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:09.767337553Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:10.856295436Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:10.856351903Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:11.644065557Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:11.644516689Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:12.351528334Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:12.351552511Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:13.942567778Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:13.942628975Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:14.850079133Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:14.850083456Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:16.770653621Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:16.771041984Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:17.843605138Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:17.843646359Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:20.249291373Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:20.249445534Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:20.944170989Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:20.944229624Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:23.650699315Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:23.651225094Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:24.947937004Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:24.949050569Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:26.478454725Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:26.478692505Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:28.544103154Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:28.544121813Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:30.049961470Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:30.050083840Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:31.645684762Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:31.645689735Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:32.051331128Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:32.051388222Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:33.044271989Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:33.044355314Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:34.442582916Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:34.442653024Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:37.254660208Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:37.254732881Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:38.851692058Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:38.851763478Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:40.957754980Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:40.957820345Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:43.944654316Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:43.944743319Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:47.244371510Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:47.251575961Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:49.542683427Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:49.542768229Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:51.242130083Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:51.242131415Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:53.146698776Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:53.146784707Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:54.562907966Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:54.563147100Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:56.250136463Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:56.250506014Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:58.249917849Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:58.249983205Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
2022-02-18T08:55:59.054935909Z mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
2022-02-18T08:55:59.056628406Z Total: 0 B, Transferred: 0 B, Speed: 0 B/s
harshavardhana commented 2 years ago

This seems to have something to do with minio 2020-04-04T05:39:31Z not supporting something that mc mirror --watch is expecting - but I thought that mc mirror just needed/used standard s3 APIs, so even older builds of minio should be ok.

That is strange @chrisbecke can you provide --watch --debug output ?

chrisbecke commented 2 years ago

With --debug the log is seemingly full of these three entries repeated - Theres lots so I trimmed down to this :-

mc: <DEBUG> GET /downloads/?events=s3%3AObjectCreated%3A%2A&events=s3%3AObjectRemoved%3A%2A&events=s3%3ABucketCreated%3A%2A&events=s3%3ABucketRemoved%3A%2A&prefix=&suffix= HTTP/1.1
Host: der2219.mgsops.net:9000
User-Agent: MinIO (linux; amd64) minio-go/v7.0.22 mc/RELEASE.2022-02-16T05-54-01Z
Authorization: AWS4-HMAC-SHA256 Credential=**REDACTED**/20220218/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20220218T110642Z
Accept-Encoding: gzip

mc: <DEBUG> HTTP/1.1 400 Bad Request
Content-Length: 312
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Content-Type: application/xml
Date: Fri, 18 Feb 2022 11:06:42 GMT
Server: MinIO/RELEASE.2020-04-04T05-39-31Z
Vary: Origin
X-Amz-Request-Id: 16D4DCADC794D224
X-Xss-Protection: 1; mode=block

<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>InvalidArgument</Code><Message>A specified event is not supported for notifications.</Message><BucketName>downloads</BucketName><Resource>/downloads/</Resource><RequestId>16D4DCADC794D224</RequestId><HostId>7e2c867a-ebf6-4d79-849a-b1769e9217f9</HostId></Error>mc: <DEBUG> Response Time:  187.189441ms

mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
 (1) mirror-main.go:500 cmd.(*mirrorJob).monitorMirrorStatus(..)
 (0) client-s3.go:796 cmd.(*S3Client).Watch.func1(..)
 Release-Tag:RELEASE.2022-02-16T05-54-01Z | Commit:220f7c217a8f | Host:447595c11a8f | OS:linux | Arch:amd64 | Lang:go1.17.6 | Mem:57 MB/112 MB | Heap:57 MB/99 MB
Total: 0 B, Transferred: 0 B, Speed: 0 B/s

mc: <DEBUG> GET /gtp-downloads/?delimiter=&encoding-type=url&fetch-owner=true&list-type=2&metadata=true&prefix= HTTP/1.1
Host: s3.iob.gametech.mgsops.net
User-Agent: MinIO (linux; amd64) minio-go/v7.0.22 mc/RELEASE.2022-02-16T05-54-01Z
Authorization: AWS4-HMAC-SHA256 Credential=**REDACTED**/20220218/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20220218T110642Z
Accept-Encoding: gzip

mc: <DEBUG> HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Security-Policy: block-all-mixed-content
Content-Type: application/xml
Date: Fri, 18 Feb 2022 11:06:43 GMT
Server: MinIO
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Origin
Vary: Accept-Encoding
X-Amz-Request-Id: 16D4DCADC252B650
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block

mc: <DEBUG> Response Time:  361.508262ms

mc: <DEBUG> GET /downloads/?delimiter=&encoding-type=url&fetch-owner=true&list-type=2&metadata=true&prefix= HTTP/1.1
Host: der2219.mgsops.net:9000
User-Agent: MinIO (linux; amd64) minio-go/v7.0.22 mc/RELEASE.2022-02-16T05-54-01Z
Authorization: AWS4-HMAC-SHA256 Credential=**REDACTED**/20220218/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20220218T110642Z
Accept-Encoding: gzip

mc: <DEBUG> HTTP/1.1 200 OK
Content-Length: 292333
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Content-Type: application/xml
Date: Fri, 18 Feb 2022 11:06:43 GMT
Server: MinIO/RELEASE.2020-04-04T05-39-31Z
Vary: Origin
X-Amz-Request-Id: 16D4DCADC79B04DC
X-Xss-Protection: 1; mode=block

mc: <DEBUG> Response Time:  540.738638ms
harshavardhana commented 2 years ago

With --debug the log is seemingly full of these three entries repeated - Theres lots so I trimmed down to this :-

mc: <DEBUG> GET /downloads/?events=s3%3AObjectCreated%3A%2A&events=s3%3AObjectRemoved%3A%2A&events=s3%3ABucketCreated%3A%2A&events=s3%3ABucketRemoved%3A%2A&prefix=&suffix= HTTP/1.1
Host: der2219.mgsops.net:9000
User-Agent: MinIO (linux; amd64) minio-go/v7.0.22 mc/RELEASE.2022-02-16T05-54-01Z
Authorization: AWS4-HMAC-SHA256 Credential=**REDACTED**/20220218/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20220218T110642Z
Accept-Encoding: gzip

mc: <DEBUG> HTTP/1.1 400 Bad Request
Content-Length: 312
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Content-Type: application/xml
Date: Fri, 18 Feb 2022 11:06:42 GMT
Server: MinIO/RELEASE.2020-04-04T05-39-31Z
Vary: Origin
X-Amz-Request-Id: 16D4DCADC794D224
X-Xss-Protection: 1; mode=block

<?xml version="1.0" encoding="UTF-8"?>
<Error><Code>InvalidArgument</Code><Message>A specified event is not supported for notifications.</Message><BucketName>downloads</BucketName><Resource>/downloads/</Resource><RequestId>16D4DCADC794D224</RequestId><HostId>7e2c867a-ebf6-4d79-849a-b1769e9217f9</HostId></Error>mc: <DEBUG> Response Time:  187.189441ms

mc: <ERROR> Failed to perform mirroring A specified event is not supported for notifications.
 (1) mirror-main.go:500 cmd.(*mirrorJob).monitorMirrorStatus(..)
 (0) client-s3.go:796 cmd.(*S3Client).Watch.func1(..)
 Release-Tag:RELEASE.2022-02-16T05-54-01Z | Commit:220f7c217a8f | Host:447595c11a8f | OS:linux | Arch:amd64 | Lang:go1.17.6 | Mem:57 MB/112 MB | Heap:57 MB/99 MB
Total: 0 B, Transferred: 0 B, Speed: 0 B/s

mc: <DEBUG> GET /gtp-downloads/?delimiter=&encoding-type=url&fetch-owner=true&list-type=2&metadata=true&prefix= HTTP/1.1
Host: s3.iob.gametech.mgsops.net
User-Agent: MinIO (linux; amd64) minio-go/v7.0.22 mc/RELEASE.2022-02-16T05-54-01Z
Authorization: AWS4-HMAC-SHA256 Credential=**REDACTED**/20220218/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20220218T110642Z
Accept-Encoding: gzip

mc: <DEBUG> HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Security-Policy: block-all-mixed-content
Content-Type: application/xml
Date: Fri, 18 Feb 2022 11:06:43 GMT
Server: MinIO
Strict-Transport-Security: max-age=31536000; includeSubDomains
Vary: Origin
Vary: Accept-Encoding
X-Amz-Request-Id: 16D4DCADC252B650
X-Content-Type-Options: nosniff
X-Xss-Protection: 1; mode=block

mc: <DEBUG> Response Time:  361.508262ms

mc: <DEBUG> GET /downloads/?delimiter=&encoding-type=url&fetch-owner=true&list-type=2&metadata=true&prefix= HTTP/1.1
Host: der2219.mgsops.net:9000
User-Agent: MinIO (linux; amd64) minio-go/v7.0.22 mc/RELEASE.2022-02-16T05-54-01Z
Authorization: AWS4-HMAC-SHA256 Credential=**REDACTED**/20220218/us-east-1/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=**REDACTED**
X-Amz-Content-Sha256: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
X-Amz-Date: 20220218T110642Z
Accept-Encoding: gzip

mc: <DEBUG> HTTP/1.1 200 OK
Content-Length: 292333
Accept-Ranges: bytes
Content-Security-Policy: block-all-mixed-content
Content-Type: application/xml
Date: Fri, 18 Feb 2022 11:06:43 GMT
Server: MinIO/RELEASE.2020-04-04T05-39-31Z
Vary: Origin
X-Amz-Request-Id: 16D4DCADC79B04DC
X-Xss-Protection: 1; mode=block

mc: <DEBUG> Response Time:  540.738638ms

Ah this is because the API has changed and latest mc doesn't know what it should do.

However this shouldn't cause such high level of stack growth.

harshavardhana commented 2 years ago

@chrisbecke how long does this take to error out?

chrisbecke commented 2 years ago

About 15 minutes.

For reference, the red line is another service that is doing a mirror watch from a minio:latest instance, to AWS S3.

Screenshot 2022-02-21 at 07 43 12

harshavardhana commented 2 years ago

For reference, the red line is another service that is doing a mirror watch from a minio:latest instance, to AWS S3.

I tested this locally i can't see any growth in memory at all - there is tiny growth but it happens are a really really long time.

can you reproduce this with mc mirror --watch old-minio/bucket /some/local/path - is it only reproducible with minio to minio or minio to s3 ?

chrisbecke commented 2 years ago

Doing a mirror against /some/local/path grows just as fast. However, using a smaller source bucket sees the growth slow down substantially.

harshavardhana commented 2 years ago

Doing a mirror against /some/local/path grows just as fast. However, using a smaller source bucket sees the growth slow down substantially.

Ah I see it is, not just the failure of event notification watch - it is the build up due to Listobjects() @vadmeste

vadmeste commented 2 years ago

Ah I see it is, not just the failure of event notification watch - it is the build up due to Listobjects() @vadmeste

yes, I am sending a serie of PRs to fix that.

harshavardhana commented 2 years ago

This should be fixed in the latest release, closing this issue.

Vane-L commented 1 year ago

这应该在最新版本中修复,关闭此问题。

could you give me the fix commit, thanks @harshavardhana