Closed voidus closed 1 year ago
@voidus Upgrade. Your logs show RELEASE.2022-09-01T23-53-36Z, not RELEASE.2022-11-26T22-43-32Z. This bug was fixed in more recent versions.
Still happening with RELEASE.2022-11-29T23-40-49Z in our test suite. I'll make sure it reproduces with just docker run and post the log here.
Did a docker system prune --volumes followed by docker system prune -a and ran again:
> docker run --rm -ti quay.io/minio/minio server /data --console-address :9001
Formatting 1st pool, 1 set(s), 1 drives per set.
WARNING: Host local has more than 0 drives of set. A host failure will result in data becoming unavailable.
Warning: Default parity set to 0. This can lead to data loss.
WARNING: Detected default credentials 'minioadmin:minioadmin', we recommend that you change these values with 'MINIO_ROOT_USER' and 'MINIO_ROOT_PASSWORD' environment variables
MinIO Object Storage Server
Copyright: 2015-2022 MinIO, Inc.
License: GNU AGPLv3 <https://www.gnu.org/licenses/agpl-3.0.html>
Version: RELEASE.2022-11-29T23-40-49Z (go1.19.3 linux/amd64)
Status: 1 Online, 0 Offline.
API: http://172.17.0.2:9000 http://127.0.0.1:9000
RootUser: minioadmin
RootPass: minioadmin
Console: http://172.17.0.2:9001 http://127.0.0.1:9001
RootUser: minioadmin
RootPass: minioadmin
Command-line: https://min.io/docs/minio/linux/reference/minio-mc.html#quickstart
$ mc alias set myminio http://172.17.0.2:9000 minioadmin minioadmin
Documentation: https://min.io/docs/minio/linux/index.html
API: SYSTEM()
Time: 11:50:22 UTC 12/02/2022
DeploymentID: 82cf9dc5-122c-49a6-801b-fc760ead88e6
Error: readObjectStart: expect { or n, but found , error found in #1 byte of ...||..., bigger context ...||... (*errors.errorString)
5: internal/logger/logger.go:258:logger.LogIf()
4: cmd/xl-storage.go:2262:cmd.(*xlStorage).RenameData()
3: cmd/xl-storage-disk-id-check.go:374:cmd.(*xlStorageDiskIDCheck).RenameData()
2: cmd/erasure-object.go:762:cmd.renameData.func1()
1: internal/sync/errgroup/errgroup.go:123:errgroup.(*Group).Go.func1()
API: SYSTEM()
Time: 11:50:22 UTC 12/02/2022
DeploymentID: 82cf9dc5-122c-49a6-801b-fc760ead88e6
Error: readObjectStart: expect { or n, but found , error found in #1 byte of ...||..., bigger context ...||... (*errors.errorString)
5: internal/logger/logger.go:258:logger.LogIf()
4: cmd/xl-storage.go:2262:cmd.(*xlStorage).RenameData()
3: cmd/xl-storage-disk-id-check.go:374:cmd.(*xlStorageDiskIDCheck).RenameData()
2: cmd/erasure-object.go:762:cmd.renameData.func1()
1: internal/sync/errgroup/errgroup.go:123:errgroup.(*Group).Go.func1()
API: SYSTEM()
Time: 11:51:22 UTC 12/02/2022
DeploymentID: 82cf9dc5-122c-49a6-801b-fc760ead88e6
Error: readObjectStart: expect { or n, but found , error found in #1 byte of ...||..., bigger context ...||... (*errors.errorString)
5: internal/logger/logger.go:258:logger.LogIf()
4: cmd/xl-storage.go:2262:cmd.(*xlStorage).RenameData()
3: cmd/xl-storage-disk-id-check.go:374:cmd.(*xlStorageDiskIDCheck).RenameData()
2: cmd/erasure-object.go:762:cmd.renameData.func1()
1: internal/sync/errgroup/errgroup.go:123:errgroup.(*Group).Go.func1()
API: SYSTEM()
Time: 11:51:22 UTC 12/02/2022
DeploymentID: 82cf9dc5-122c-49a6-801b-fc760ead88e6
Error: readObjectStart: expect { or n, but found , error found in #1 byte of ...||..., bigger context ...||... (*errors.errorString)
5: internal/logger/logger.go:258:logger.LogIf()
4: cmd/xl-storage.go:2262:cmd.(*xlStorage).RenameData()
3: cmd/xl-storage-disk-id-check.go:374:cmd.(*xlStorageDiskIDCheck).RenameData()
2: cmd/erasure-object.go:762:cmd.renameData.func1()
1: internal/sync/errgroup/errgroup.go:123:errgroup.(*Group).Go.func1()
^CExiting on signal: INTERRUPT
As I wrote, any pointers on how to debug this further would be appreciated. There are no requests hitting this; this is an unedited log.
This log doesn't mention it, but I'm seeing the line from the issue title in our test logs. I can run the plain container a bit longer if you think that might be insightful.
Likely corrupted objects on disk. Should be fixable with healing.
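For reference, a minimal sketch of triggering a heal with the MinIO client, following the suggestion above (the alias, endpoint, and default credentials are taken from the quickstart banner in the log and are assumptions about your setup):

```shell
# Register the alias printed in the server startup banner
# (endpoint and credentials are assumptions; adjust to your deployment)
mc alias set myminio http://127.0.0.1:9000 minioadmin minioadmin

# Recursively scan and heal objects on the deployment
mc admin heal -r myminio
```

Note this can only repair objects for which enough healthy erasure-coded data remains; on a single-drive setup with parity 0 (as in the log above) there is no redundancy to heal from.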
I'm not exactly sure how this applies here. This is a fresh docker container without volume or port mappings, and it happens pretty reliably, would there be a reason why the objects on my disk get corrupted after a few minutes with zero requests sent to the server?
It is because your backend disk is faking writes and is not persistent. You need to provide real disks with a real filesystem for the data partition.
The overlay filesystem perhaps in use is losing data sequences.
You should start the container via docker run -v /my-local-path:/data minio/minio server /data
If you are using some kind of overlayed FS then it can fake POSIX semantics and lose data in between.
Nothing to do with MinIO here, just a setup issue on your end.
Could you please elaborate on "losing data sequences" or drop a relevant link? I'm not familiar with that term and searching didn't bring up anything.
I looked into journalctl -k and found occasional i2c i2c-2: sendbytes: NAK bailout. lines, but I don't think that's relevant; feel free to suggest otherwise. I'm using a SATA Samsung SSD 860 EVO 1TB formatted with f2fs.
I'm still having the issue with our docker-compose stuff. I've been trying more things:
# no errors with this (yet)
> docker run --rm -ti -v "$(mktemp -d):/data" quay.io/minio/minio server /data --console-address :9001
# <output omitted>
# this produces an error, not sure if it's the same one
> docker run --rm -ti -v "foo:/data" quay.io/minio/minio server /data --console-address :9001
Formatting 1st pool, 1 set(s), 1 drives per set.
WARNING: Host local has more than 0 drives of set. A host failure will result in data becoming unavailable.
Warning: Default parity set to 0. This can lead to data loss.
WARNING: Detected default credentials 'minioadmin:minioadmin', we recommend that you change these values with 'MINIO_ROOT_USER' and 'MINIO_ROOT_PASSWORD' environment variables
MinIO Object Storage Server
Copyright: 2015-2022 MinIO, Inc.
License: GNU AGPLv3 <https://www.gnu.org/licenses/agpl-3.0.html>
Version: RELEASE.2022-11-29T23-40-49Z (go1.19.3 linux/amd64)
Status: 1 Online, 0 Offline.
API: http://172.17.0.3:9000 http://127.0.0.1:9000
RootUser: minioadmin
RootPass: minioadmin
Console: http://172.17.0.3:9001 http://127.0.0.1:9001
RootUser: minioadmin
RootPass: minioadmin
Command-line: https://min.io/docs/minio/linux/reference/minio-mc.html#quickstart
$ mc alias set myminio http://172.17.0.3:9000 minioadmin minioadmin
Documentation: https://min.io/docs/minio/linux/index.html
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ You are running an older version of MinIO released 2 days ago ┃
┃ Update: Run `mc admin update` ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛
API: SYSTEM()
Time: 16:29:30 UTC 12/06/2022
DeploymentID: 709f69d3-18fd-4d76-8ded-61c1b0fd1977
Error: readObjectStart: expect { or n, but found , error found in #1 byte of ...||..., bigger context ...||... (*errors.errorString)
5: internal/logger/logger.go:258:logger.LogIf()
4: cmd/xl-storage.go:2262:cmd.(*xlStorage).RenameData()
3: cmd/xl-storage-disk-id-check.go:374:cmd.(*xlStorageDiskIDCheck).RenameData()
2: cmd/erasure-object.go:762:cmd.renameData.func1()
1: internal/sync/errgroup/errgroup.go:123:errgroup.(*Group).Go.func1()
<repeats>
Here's the relevant docker inspect output:
> docker inspect dazzling_mclean | jq '.[0].Mounts'
[
{
"Type": "bind",
"Source": "/tmp/nix-shell.4lT9K6/tmp.AGR0vjPqIJ",
"Destination": "/data",
"Mode": "",
"RW": true,
"Propagation": "rprivate"
}
]
> docker exec dazzling_mclean mount | grep 'on /data'
tmpfs on /data type tmpfs (rw,nosuid,nodev,nr_inodes=1048576,inode64)
> docker inspect youthful_napier | jq '.[0].Mounts'
[
{
"Type": "volume",
"Name": "foo",
"Source": "/var/lib/docker/volumes/foo/_data",
"Destination": "/data",
"Driver": "local",
"Mode": "z",
"RW": true,
"Propagation": ""
}
]
> docker exec youthful_napier mount | grep 'on /data'
/dev/mapper/vg-root on /data type f2fs (rw,noatime,lazytime,background_gc=on,nodiscard,no_heap,user_xattr,inline_xattr,acl,inline_data,inline_dentry,flush_merge,extent_cache,mode=adaptive,active_logs=6,alloc_mode=default,checkpoint_merge,fsync_mode=posix,discard_unit=block,memory=normal)
This makes me think that it's an f2fs issue, since it's working on tmpfs. I'll try to find out more.
I looked at disk SMART data and ran a short self test, everything looks peachy. Running linux 6.0.8-arch1-1 here, basically stock arch linux.
Also, the docker image page suggests running it without a persistent volume. Is that a documentation bug? (I'm assuming podman and docker do the same thing with overlays, which is probably wrong, but since it's on Docker Hub, if running on docker without -v is unsupported, that should be clearly stated.)
From the f2fs docs:
fsync_mode=%s Control the policy of fsync. Currently supports "posix",
"strict", and "nobarrier". In "posix" mode, which is
default, fsync will follow POSIX semantics and does a
light operation to improve the filesystem performance.
In "strict" mode, fsync will be heavy and behaves in line
with xfs, ext4 and btrfs, where xfstest generic/342 will
pass, but the performance will regress. "nobarrier" is
based on "posix", but doesn't issue flush command for
non-atomic files likewise "nobarrier" mount option.
Note that I'm using posix. Please reopen; this is not the setup problem you hinted at.
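For anyone checking their own setup: the effective fsync_mode can be read from the mount options of the filesystem backing the data path (a sketch; /data as the mountpoint is an assumption, adjust to your setup):

```shell
# Print only the fsync_mode option for the filesystem backing /data
# (mountpoint is an assumption; change it to match your setup)
findmnt -no OPTIONS /data | tr ',' '\n' | grep '^fsync_mode'
```

If nothing is printed, the filesystem either isn't f2fs or is using its built-in default.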
@harshavardhana please reopen the issue, using a volume doesn't change anything.
I've hit the same problem, but with the .minio.sys/pool.bin file being reported as "file corrupted". I'm also on f2fs. Changing fsync_mode from the implicit default (posix) to strict and starting from scratch (empty MinIO data/configuration) fixed the problem.
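A sketch of applying that change, assuming the kernel accepts updating fsync_mode on remount (device path and mountpoint are assumptions; otherwise unmount and mount fresh):

```shell
# Remount the f2fs partition with strict fsync semantics
# (mountpoint /mnt/data is an assumption; adjust to your setup)
mount -o remount,fsync_mode=strict /mnt/data

# Or persist it across reboots in /etc/fstab (device path is an assumption):
# /dev/sda1  /mnt/data  f2fs  rw,noatime,fsync_mode=strict  0 2
```

As noted above, existing MinIO data written under posix mode may already be corrupted, so start from an empty data directory after switching.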
We cannot stop you from shooting yourself in the foot. We recommend xfs only for a reason - anything else is at your own risk.
I'm using minio as an object store for e2e tests using docker-compose, but I could minimize the repro case quite a bit.
It's important to note that my tests seem to run fine though, so I'm not sure what this is actually affecting.
When running minio with docker on my local machine (arch linux with some nix packages) I get the error in the summary after a while, even if I didn't interact with the service.
This only seems to affect me, so I guess this is more of a debugging thing. Since minio only logs errors, I can't see any way forward here. I still wanted to report it though because searching the internet turned up literally nothing for this.
Expected Behavior
Nothing, basically.
Current Behavior
Steps to Reproduce (for bugs)
docker run --rm -ti quay.io/minio/minio server /data --console-address :9001
Context
Corrupted /data/.minio.sys/buckets/.usage.json/xl.meta (only file in that directory) from another run:
Your Environment
Version used (minio --version):
Operating System and version (uname -a): Linux boop 6.0.8-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 10 Nov 2022 21:14:24 +0000 x86_64 GNU/Linux