juicedata / juicefs

JuiceFS is a distributed POSIX file system built on top of Redis and S3.
https://juicefs.com
Apache License 2.0
10.83k stars 954 forks

Fluid + JuiceFS: JuiceFS worker restarts many times #3203

Open andyzheung opened 1 year ago

andyzheung commented 1 year ago

1. juicefs version: juicefsruntime-controller:v0.8.2 juicefs-fuse:v1.0.0

2. juicefs worker: jfs-ceph-demo-worker-0 2/2 Running 14 13d

3. logs:

2023/01/28 01:58:09.319687 juicefs[7] : Meta address: redis://:@mymaster,redis-sentinel-prod-node-0.redis-sentinel-prod-headless.redis-sentinel,redis-sentinel-prod-node-1.redis-sentinel-prod-headless.redis-sentinel,redis-sentinel-prod-node-2.redis-sentinel-prod-headless.redis-sentinel:26379/2 [interface.go:402]
redis: 2023/01/28 01:58:09 sentinel.go:700: sentinel: discovered new sentinel="redis-sentinel-prod-node-2.redis-sentinel-prod-headless.redis-sentinel.svc.cluster.local:26379" for master="mymaster"
redis: 2023/01/28 01:58:09 sentinel.go:700: sentinel: discovered new sentinel="redis-sentinel-prod-node-1.redis-sentinel-prod-headless.redis-sentinel.svc.cluster.local:26379" for master="mymaster"
redis: 2023/01/28 01:58:09 sentinel.go:661: sentinel: new master="mymaster" addr="redis-sentinel-prod-node-0.redis-sentinel-prod-headless.redis-sentinel.svc.cluster.local:6379"
2023/01/28 01:58:09.813084 juicefs[7] : Ping redis: 111.548442ms [redis.go:2878]
2023/01/28 01:58:09.863370 juicefs[7] : Data use s3://xxx-juicefs/ceph-jfs/ [format.go:435]
2023/01/28 01:58:09.918755 juicefs[7] : Volume is formatted as { "Name": "ceph-jfs", "UUID": "7b4a1644-32f6-4a4b-9574-21b2a02de61f", "Storage": "s3", "Bucket": "http://xxx-juicefs.10.240.62.11:8080", "AccessKey": "simu-user-2", "SecretKey": "removed", "BlockSize": 4096, "Compression": "none", "KeyEncrypted": true, "TrashDays": 1, "MetaVersion": 1 } [format.go:472]
2023/01/28 01:58:10.197372 juicefs[32] : Meta address: redis://:@mymaster,redis-sentinel-prod-node-0.redis-sentinel-prod-headless.redis-sentinel,redis-sentinel-prod-node-1.redis-sentinel-prod-headless.redis-sentinel,redis-sentinel-prod-node-2.redis-sentinel-prod-headless.redis-sentinel:26379/2 [interface.go:402]
redis: 2023/01/28 01:58:10 sentinel.go:700: sentinel: discovered new sentinel="redis-sentinel-prod-node-2.redis-sentinel-prod-headless.redis-sentinel.svc.cluster.local:26379" for master="mymaster"
redis: 2023/01/28 01:58:10 sentinel.go:700: sentinel: discovered new sentinel="redis-sentinel-prod-node-1.redis-sentinel-prod-headless.redis-sentinel.svc.cluster.local:26379" for master="mymaster"
redis: 2023/01/28 01:58:10 sentinel.go:661: sentinel: new master="mymaster" addr="redis-sentinel-prod-node-0.redis-sentinel-prod-headless.redis-sentinel.svc.cluster.local:6379"
2023/01/28 01:58:11.237205 juicefs[32] : Ping redis: 327.986739ms [redis.go:2878]
2023/01/28 01:58:11.470577 juicefs[32] : Data use s3://xxx-juicefs/ceph-jfs/ [mount.go:422]
2023/01/28 01:58:11.471008 juicefs[32] : Disk cache (/dev/shm/7b4a1644-32f6-4a4b-9574-21b2a02de61f/): capacity (102400 MB), free ratio (10%), max pending pages (15) [disk_cache.go:93]
2023/01/28 01:58:11.701112 juicefs[32] : Create session 15 OK with version: 1.0.0+2022-08-08.cf0c269b [base.go:266]
2023/01/28 01:58:11.703540 juicefs[32] : Prometheus metrics listening on [::]:9567 [mount.go:160]
2023/01/28 01:58:11.703737 juicefs[32] : Mounting volume ceph-jfs at /runtime-mnt/juicefs/xxx/jfs-ceph-demo/juicefs-fuse ... [mount_unix.go:181]
2023/01/28 01:58:12.149224 juicefs[32] : OK, ceph-jfs is ready at /runtime-mnt/juicefs/xxx/jfs-ceph-demo/juicefs-fuse [mount_unix.go:45]
2023/01/28 01:58:13.257048 juicefs[32] : close session 15: %!s() [base.go:333]

zwwhdls commented 1 year ago

Hi @andyzheung, can you check whether the JuiceFS worker pod has a liveness probe configured? The probe may be timing out, which would cause the pod to restart.
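
One way to run that check is to dump the pod spec as JSON (for example with `kubectl get pod jfs-ceph-demo-worker-0 -o json`, adding whatever namespace applies) and look at each container's `livenessProbe` next to its restart count. The sketch below is only an illustration of that idea, not something taken from Fluid or JuiceFS: the function name `summarize_liveness` and the `sample_pod` data (container name, probe values, exit reason) are hypothetical, and only standard Kubernetes pod fields are read.

```python
from typing import Any, Dict, List


def summarize_liveness(pod: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Summarize liveness probe settings and restart info per container.

    `pod` is the parsed output of `kubectl get pod <name> -o json`.
    """
    # Index container statuses by name so they can be joined with the spec.
    statuses = {
        s["name"]: s
        for s in pod.get("status", {}).get("containerStatuses", [])
    }
    summary = []
    for container in pod.get("spec", {}).get("containers", []):
        probe = container.get("livenessProbe")
        status = statuses.get(container["name"], {})
        # lastState.terminated is only present if the container has exited before.
        last_exit = status.get("lastState", {}).get("terminated", {})
        summary.append({
            "container": container["name"],
            "has_liveness_probe": probe is not None,
            "timeout_seconds": (probe or {}).get("timeoutSeconds"),
            "period_seconds": (probe or {}).get("periodSeconds"),
            "failure_threshold": (probe or {}).get("failureThreshold"),
            "restart_count": status.get("restartCount", 0),
            "last_exit_reason": last_exit.get("reason"),
        })
    return summary


# Hypothetical pod data for illustration only; real values come from kubectl.
sample_pod = {
    "spec": {
        "containers": [
            {
                "name": "example-worker",  # hypothetical container name
                "livenessProbe": {
                    "timeoutSeconds": 1,
                    "periodSeconds": 10,
                    "failureThreshold": 3,
                },
            },
            {"name": "example-sidecar"},  # no probe configured
        ]
    },
    "status": {
        "containerStatuses": [
            {
                "name": "example-worker",
                "restartCount": 14,
                "lastState": {"terminated": {"reason": "Error", "exitCode": 137}},
            },
            {"name": "example-sidecar", "restartCount": 0},
        ]
    },
}

result = summarize_liveness(sample_pod)
for row in result:
    print(row)
```

If a container shows a probe with a short `timeout_seconds` together with a high `restart_count`, the events from `kubectl describe pod` should show whether failed liveness checks are what triggered the restarts.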