juicedata / juicefs

JuiceFS is a distributed POSIX file system built on top of Redis and S3.
https://juicefs.com
Apache License 2.0

WRONGPASS invalid username-password pair or user is disabled in juicefs logs #5136

Closed venera70 closed 2 months ago

venera70 commented 2 months ago

What happened: We run git clone of a large open-source codebase with hundreds of thousands of files into a JuiceFS filesystem. The checkout fails towards the end with the following git error:

fatal: cannot pread pack file: Input/output error
fatal: fetch-pack: invalid index-pack output

In the JuiceFS client logs, there are multiple occurrences of the following:

2024/09/03 15:51:41.986960 juicefs[10329] <ERROR>: error: WRONGPASS invalid username-password pair or user is disabled.
goroutine 16556954 [running]:
runtime/debug.Stack()
    /usr/local/go/src/runtime/debug/stack.go:24 +0x64
github.com/juicedata/juicefs/pkg/meta.errno({0x3ccef20, 0x400296c1d0})
    /home/ubuntu/juicefs/pkg/meta/utils.go:123 +0xa0
github.com/juicedata/juicefs/pkg/meta.(*redisMeta).doRead(0x40018528c0, {0x3d0c418, 0x40014d3040}, 0x40028f1f80?, 0x10?)
    /home/ubuntu/juicefs/pkg/meta/redis.go:2201 +0x98
github.com/juicedata/juicefs/pkg/meta.(*baseMeta).Read(0x400006ca08, {0x3d0c418, 0x40014d3040}, 0xa8fc, 0x0, 0x40028f1c98)
    /home/ubuntu/juicefs/pkg/meta/base.go:1407 +0x224
github.com/juicedata/juicefs/pkg/vfs.(*sliceReader).run(0x4000db8480)
    /home/ubuntu/juicefs/pkg/vfs/reader.go:173 +0x1d0
created by github.com/juicedata/juicefs/pkg/vfs.(*fileReader).newSlice in goroutine 16556949
    /home/ubuntu/juicefs/pkg/vfs/reader.go:327 +0x2b8 [utils.go:123]
2024/09/03 15:51:41.987871 juicefs[10329] <ERROR>: error: WRONGPASS invalid username-password pair or user is disabled.
goroutine 16556959 [running]:
runtime/debug.Stack()
    /usr/local/go/src/runtime/debug/stack.go:24 +0x64
github.com/juicedata/juicefs/pkg/meta.errno({0x3ccef20, 0x400296c290})
    /home/ubuntu/juicefs/pkg/meta/utils.go:123 +0xa0
github.com/juicedata/juicefs/pkg/meta.(*redisMeta).doRead(0x40018528c0, {0x3d0c418, 0x40014d3040}, 0x40028f1f80?, 0x10?)
    /home/ubuntu/juicefs/pkg/meta/redis.go:2201 +0x98
github.com/juicedata/juicefs/pkg/meta.(*baseMeta).Read(0x400006ca08, {0x3d0c418, 0x40014d3040}, 0xa8fc, 0x0, 0x40028f1db8)
    /home/ubuntu/juicefs/pkg/meta/base.go:1407 +0x224
github.com/juicedata/juicefs/pkg/vfs.(*sliceReader).run(0x4000db8720)
    /home/ubuntu/juicefs/pkg/vfs/reader.go:173 +0x1d0
created by github.com/juicedata/juicefs/pkg/vfs.(*fileReader).newSlice in goroutine 1
    /home/ubuntu/juicefs/pkg/vfs/reader.go:327 +0x2b8 [utils.go:123]
2024/09/03 15:51:43.157841 juicefs[10329] <ERROR>: error: WRONGPASS invalid username-password pair or user is disabled.
goroutine 16558146 [running]:
runtime/debug.Stack()
    /usr/local/go/src/runtime/debug/stack.go:24 +0x64
github.com/juicedata/juicefs/pkg/meta.errno({0x3ccef20, 0x4000cdf310})
    /home/ubuntu/juicefs/pkg/meta/utils.go:123 +0xa0
github.com/juicedata/juicefs/pkg/meta.(*redisMeta).doGetAttr(0x40018528c0, {0x3d0d138, 0x40018d05a0}, 0xa8fc, 0x4001366b40)
    /home/ubuntu/juicefs/pkg/meta/redis.go:831 +0x174
github.com/juicedata/juicefs/pkg/meta.(*baseMeta).GetAttr(0x400006ca08, {0x3d0d138, 0x40018d05a0}, 0x2d7b840?, 0x4001366b40)
    /home/ubuntu/juicefs/pkg/meta/base.go:919 +0x254
github.com/juicedata/juicefs/pkg/vfs.(*VFS).GetAttr(0x4001874410, {0x3d0eb80, 0x40018d05a0}, 0xa8fc, 0xe8?)
    /home/ubuntu/juicefs/pkg/vfs/vfs.go:193 +0x104
github.com/juicedata/juicefs/pkg/fuse.(*fileSystem).GetAttr(0x40018b4d60, 0xfffff7fbd108?, 0x4000049a78, 0x40000499e8)
    /home/ubuntu/juicefs/pkg/fuse/fuse.go:101 +0x98
github.com/hanwen/go-fuse/v2/fuse.doGetAttr(0x400319be18?, 0x40000498c8)
    /home/ubuntu/go/pkg/mod/github.com/juicedata/go-fuse/v2@v2.1.1-0.20240425033113-7c40cb5eb3e9/fuse/opcode.go:301 +0x50
github.com/hanwen/go-fuse/v2/fuse.init.0.func1(0x40000498c8?, 0x23e2758?)
    /home/ubuntu/go/pkg/mod/github.com/juicedata/go-fuse/v2@v2.1.1-0.20240425033113-7c40cb5eb3e9/fuse/opcode.go:774 +0x50
github.com/hanwen/go-fuse/v2/fuse.(*Server).handleRequest(0x4000c10d00, 0x40000498c8)
    /home/ubuntu/go/pkg/mod/github.com/juicedata/go-fuse/v2@v2.1.1-0.20240425033113-7c40cb5eb3e9/fuse/server.go:708 +0x218
github.com/hanwen/go-fuse/v2/fuse.(*Server).loop(0x4000c10d00, 0x1)
    /home/ubuntu/go/pkg/mod/github.com/juicedata/go-fuse/v2@v2.1.1-0.20240425033113-7c40cb5eb3e9/fuse/server.go:681 +0xec
created by github.com/hanwen/go-fuse/v2/fuse.(*Server).readRequest in goroutine 16558145
    /home/ubuntu/go/pkg/mod/github.com/juicedata/go-fuse/v2@v2.1.1-0.20240425033113-7c40cb5eb3e9/fuse/server.go:414 +0x6c0 [utils.go:123]
2024/09/03 15:54:40.424246 juicefs[10329] <ERROR>: error: WRONGPASS invalid username-password pair or user is disabled.
goroutine 18947669 [running]:
runtime/debug.Stack()
    /usr/local/go/src/runtime/debug/stack.go:24 +0x64
github.com/juicedata/juicefs/pkg/meta.errno({0x3ccef20, 0x404e57da20})
    /home/ubuntu/juicefs/pkg/meta/utils.go:123 +0xa0
github.com/juicedata/juicefs/pkg/meta.(*redisMeta).doWrite(0x40018528c0, {0x3d0c418, 0x40014d3040}, 0xa8fd, 0x49, 0x0, {0x4003aab624?, 0x2?, 0x0?, 0x487344?}, ...)
    /home/ubuntu/juicefs/pkg/meta/redis.go:2207 +0x1f8
github.com/juicedata/juicefs/pkg/meta.(*baseMeta).Write(0x400006ca08, {0x3d0c418, 0x40014d3040}, 0xa8fd, 0x49, 0x0, {0x404dc57638?, 0x18d0690?, 0x40?, 0xac9f980?}, ...)
    /home/ubuntu/juicefs/pkg/meta/base.go:1475 +0x1c4
github.com/juicedata/juicefs/pkg/vfs.(*chunkWriter).commitThread(0x404d92d9e0)
    /home/ubuntu/juicefs/pkg/vfs/writer.go:200 +0x1dc
created by github.com/juicedata/juicefs/pkg/vfs.(*fileWriter).writeChunk in goroutine 1
    /home/ubuntu/juicefs/pkg/vfs/writer.go:270 +0x414 [utils.go:123]
2024/09/03 15:54:40.424304 juicefs[10329] <WARNING>: write inode:43261 error: input/output error [writer.go:207]
2024/09/03 15:54:40.424319 juicefs[10329] <ERROR>: write inode:43261 indx:73  input/output error [writer.go:211]

The WRONGPASS error seems to indicate a problem with Redis access, but the JuiceFS mountpoint is fully accessible:

dd if=/dev/urandom of=dummy.bin bs=10M count=100
100+0 records in
100+0 records out
1048576000 bytes (1.0 GB, 1000 MiB) copied, 4.89751 s, 214 MB/s

and subsequently running head on the file shows the binary data that was written.

What you expected to happen: Checkout should complete without errors.

How to reproduce it (as minimally and precisely as possible): Clone an OSS project with many small files and a long history (e.g. the Linux kernel, GCC or LLVM) into a JuiceFS filesystem with the mount flags detailed in the environment below.

Anything else we need to know? Not sure how relevant this is, but the EC2 clients that access the JuiceFS mount write to the same volume, though not to the same set of files: each client writes to its own subdirectory in the mount at any given time. Also, the juicefs binary is running as root (via sudo).

Environment:

davies commented 2 months ago

The logging suggests that you are using a WRONG password to access Redis.

venera70 commented 2 months ago

If it is a WRONG password for Redis/MemoryDB, then why were we still able to list the contents of the mountpoint and even write to it (see the output of the dd command above)?

For example, when the MemoryDB password does expire (IAM authentication) we will get:

user@ec2-host:/test$ ls
ls: reading directory '.': Input/output error
user@ec2-host:/test$

but that wasn't the case when those WRONGPASS errors were appearing in the logs.

More on the IAM authentication method for MemoryDB: We use a MemoryDB user with IAM authentication (https://docs.aws.amazon.com/memorydb/latest/devguide/auth-iam.html), whereby a dynamically generated IAM token is used as the Redis password.

From our discussions with AWS: the IAM token can only be used as a password within 15 minutes of being generated, but once a connection to MemoryDB is established, it remains active until the IAM role credentials used to generate the token expire. Additionally, we use an assumed role, whose credentials can last for a maximum of 12 hours, because the instance profile credentials (from curl http://169.254.169.254/latest/meta-data/iam/security-credentials/MyEc2IAMRole/) are only valid for 6 hours.

We generate the MemoryDB password, which requires AWS SigV4 signing of various parameters, with the following steps:

  1. Using the Python boto3 library with the code sample here: https://docs.aws.amazon.com/memorydb/latest/devguide/LambdaMemoryDB.step2.html#LambdaMemoryDB.step2.1, modified to call assume_role() on an STS client in the class like so:

        # Requires `import boto3` at the top of the module.
        # Assume the configured IAM role via STS.
        sts_client = boto3.client('sts')
        assumed_role = sts_client.assume_role(
            RoleArn=self.iam_role,
            DurationSeconds=self.role_duration,
            RoleSessionName='MyMemoryDBRole',
        )

        # Get the assumed role credentials
        self.credentials = assumed_role['Credentials']
        session = boto3.Session(
            aws_access_key_id=self.credentials['AccessKeyId'],
            aws_secret_access_key=self.credentials['SecretAccessKey'],
            aws_session_token=self.credentials['SessionToken'],
        )
  2. Before returning the token, we found through trial and error (it was not documented) that we had to URL-encode the generated token-password like so (Python, with from urllib.parse import quote):
    return quote(signed_url.removeprefix("https://"), safe='')
  3. the resulting Redis URI that is passed to the juicefs mount command is like so (the redis password is now an extremely long string): rediss://redisuser:myrediscluster%2F%3FAction%3Dconnect%26User%3Dredisuser%26X-Amz-Algorithm%3DAWS4-HMAC-SHA256%26X-Amz-Credential%3DASIASTQGD3GMLGAIE903%252F20240903%252Faws-region%252Fmemorydb%252Faws4_request%26X-Amz-Date%3D20240903T113520Z%26X-Amz-Expires%3D900%26X-Amz-SignedHeaders%3Dhost%26X-Amz-Security-Token%3DFwoGZXIvYXdzELX%252F%252F%252F%252F%252F%252F%252F%252F%252F%252FwEaDEIVKGkLU4EVPByY9SK1AaBao61ZLOMnkL8h8tx3JD3MyZNnbKOW1j5iS%252Bf5n7DYGO7T9HGY8njpqjrfJQCJwo3J4aCxhMx8Ff4J3c%252FaaNPQ%252FBI%252BIoUzFgtHemzkYpAUKQOv1uQjHYNJIAETrse98%252FiM4f7pD%252BvB%252FNCueSkDDmXSKUB8G0pDc%252Fm4DTjbDfp21T8zdO%252Bu3rN%252BgR7YASokTIjD%252F34m%252By9pZ9ZgxVucODr7XgFVaD%252BgE1%252BgGsZt8KdRijrNumco%252BOjbtgYyLZlkPFWedwE92AU1UD5ooYzQedr9bc5dqpKptxwNjtyMVVy9XlFIVyALXSi0xA%253D%253D%26X-Amz-Signature%3Dc5dbbc42e96cc3f987e11747b479572b26ce90827d4262a4201fddb48102ecd5@clustercfg.myrediscluster.asdfghj.memorydb.aws-region.amazonaws.com:6379/0
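Steps 2 and 3 can be sketched as a small standalone helper. This is only an illustration under assumptions: the function name build_redis_uri, the shortened presigned URL, and the host clustercfg.example.invalid are hypothetical, and the SigV4 signing itself (step 1) is not shown. It needs Python 3.9+ for str.removeprefix.

```python
from urllib.parse import quote


def build_redis_uri(user: str, signed_url: str, host: str,
                    port: int = 6379, db: int = 0) -> str:
    """Embed a presigned IAM token as the password of a rediss:// URI."""
    # Drop the scheme, then percent-encode every reserved character.
    # safe='' also encodes '/', so the token cannot be read as URI syntax.
    password = quote(signed_url.removeprefix("https://"), safe="")
    return f"rediss://{user}:{password}@{host}:{port}/{db}"


# Hypothetical, shortened presigned URL; a real one has many more parameters.
signed = ("https://myrediscluster/?Action=connect&User=redisuser"
          "&X-Amz-Credential=AKIDEXAMPLE%2F20240903")
uri = build_redis_uri("redisuser", signed, "clustercfg.example.invalid")
print(uri)
```

Because the presigned URL is already percent-encoded once, the second pass turns sequences such as %2F into %252F, which matches the doubly encoded form visible in the URI above.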

Question: Does JuiceFS open new connections to Redis at any time after the initial mount, or does it only reuse the connection established at the instant the mount command was issued?

If JuiceFS does establish new connections as reads/writes increase, that could explain the errors: the password is no longer valid for new connections once more than 15 minutes have elapsed since the mount command was issued.
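One way to check this hypothesis is to read the expiry time out of the token itself, since the URL-encoded password carries X-Amz-Date and X-Amz-Expires. The following is a minimal sketch, assuming the password has the shape shown in the Redis URI above; the helper name token_expiry and the shortened sample password are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, unquote, urlsplit


def token_expiry(encoded_password: str) -> datetime:
    """Return the UTC time after which the token is no longer accepted."""
    # Undo the outer quote(..., safe='') applied before building the URI.
    presigned = unquote(encoded_password)
    # The token looks like "<cluster>/?Action=connect&..."; parse its query.
    params = parse_qs(urlsplit("https://" + presigned).query)
    issued = datetime.strptime(
        params["X-Amz-Date"][0], "%Y%m%dT%H%M%SZ"
    ).replace(tzinfo=timezone.utc)
    return issued + timedelta(seconds=int(params["X-Amz-Expires"][0]))


# Shortened sample using the date and lifetime from the URI in this report.
sample = ("myrediscluster%2F%3FAction%3Dconnect%26User%3Dredisuser"
          "%26X-Amz-Date%3D20240903T113520Z%26X-Amz-Expires%3D900")
print(token_expiry(sample).isoformat())
```

Comparing this value with the timestamps of the WRONGPASS log lines (after converting both to the same time zone) would show whether the errors only start after the token's 900-second lifetime has passed.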

Is there a way we can make JuiceFS spawn multiple Redis connections (that are kept alive, so that MemoryDB doesn't terminate them) at the start so that read/write processes could just reuse these opened connections, rather than opening new ones on demand?

Thank you! P.S. JuiceFS rocks! We see significant performance gains over AWS EFS.

davies commented 2 months ago

go-redis uses a pool of connections to access MemoryDB, so additional connections are created on demand, based on the request workload. A single connection can only handle one request/response at a time.

We could ask go-redis for this feature, but even with that, I think it will still be easy to run into this problem.
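The pooling behaviour described in this comment can be illustrated with a toy model. This is a simplified simulation written in Python for consistency with the snippets above, not go-redis code: ToyServer, ToyPool, and the 900-second cutoff are assumptions made for illustration, and a real pool also handles limits, timeouts, and idle eviction.

```python
class AuthError(Exception):
    """Raised when the toy server rejects the password on a new connection."""


class ToyServer:
    def __init__(self, token_expires_at: float) -> None:
        self.token_expires_at = token_expires_at

    def connect(self, now: float) -> object:
        # Authentication happens only when a connection is opened.
        if now >= self.token_expires_at:
            raise AuthError(
                "WRONGPASS invalid username-password pair or user is disabled."
            )
        return object()


class ToyPool:
    def __init__(self, server: ToyServer) -> None:
        self.server = server
        self.idle = []   # authenticated connections waiting for reuse
        self.dials = 0   # number of connections opened so far

    def acquire(self, now: float) -> object:
        if self.idle:
            return self.idle.pop()       # reuse: no new authentication
        conn = self.server.connect(now)  # on-demand dial: authenticates again
        self.dials += 1
        return conn

    def release(self, conn: object) -> None:
        self.idle.append(conn)


def demo():
    pool = ToyPool(ToyServer(token_expires_at=900))
    conn = pool.acquire(now=0)      # opened at mount time, token still valid
    pool.release(conn)
    busy = pool.acquire(now=1000)   # reused after expiry, still works
    try:
        pool.acquire(now=1000)      # concurrent request needs a second dial
    except AuthError as exc:
        return pool.dials, str(exc)
    return pool.dials, None


print(demo())
```

In this model, sequential traffic keeps reusing the one authenticated connection even after the token has expired, while a second concurrent request forces a new dial that fails. That would be consistent with ls and dd succeeding while a heavily parallel git clone hits WRONGPASS, although the thread does not confirm this as the diagnosis.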