juicedata / juicefs

JuiceFS is a distributed POSIX file system built on top of Redis and S3.
https://juicefs.com
Apache License 2.0

Boot process gets stuck when mounting via systemd-mount #4116

Open lowshoe opened 12 months ago

lowshoe commented 12 months ago

When trying to automatically mount a JuiceFS volume via a systemd.mount unit, the boot process gets stuck after "Mounted JuiceFS".

The boot process halts at this point; the services that should follow, for example the SSH daemon, are never started.

What you expected to happen: The JuiceFS volume gets mounted and the boot process then runs through to the end.

How to reproduce it (as minimally and precisely as possible):

Create and enable a systemd.mount as documented at https://juicefs.com/docs/community/mount_juicefs_at_boot_time#automating-mounting-with-systemdmount

For example:

systemctl cat mnt-s3.mount

[Unit]
Description=Juicefs

[Mount]
Environment="META_PASSWORD=abcdefg12345"
What=rediss://my.keydb.local:6379/1
Where=/mnt/s3
Type=juicefs
Options=_netdev,user_id=1002,group_id=1002,cache-size=1024000

[Install]
WantedBy=remote-fs.target
WantedBy=multi-user.target

Anything else we need to know?

systemctl status mnt-s3.mount --no-pager
● mnt-s3.mount - Juicefs
     Loaded: loaded (/etc/systemd/system/mnt-s3.mount; disabled; preset: disabled)
     Active: active (mounted) since Thu 2023-10-19 14:59:21 CEST; 53s ago
      Until: Thu 2023-10-19 14:59:21 CEST; 53s ago
      Where: /mnt/s3
       What: JuiceFS:myjfs
      Tasks: 12 (limit: 153401)
     Memory: 107.3M
        CPU: 309ms
     CGroup: /system.slice/mnt-s3.mount
             └─1551 "/sbin/mount.juicefs rediss://my.keydb.local:6379/1 /mnt/s3 -o rw,user_id=1002,group_id=1002,cache-size=1024000,_netdev"

Oct 19 14:59:21 my.host.local mount[1528]: 2023/10/19 14:59:21.087691 juicefs[1528] : Meta address: rediss://my.redis.local:6379/1 [interface.go:497]
Oct 19 14:59:21 my.host.local mount[1528]: 2023/10/19 14:59:21.123640 juicefs[1528] : AOF is not enabled, you may lose data if Redis is not shutdown pro… [info.go:84]
Oct 19 14:59:21 my.host.local mount[1528]: 2023/10/19 14:59:21.124005 juicefs[1528] : Ping redis latency: 297.474µs [redis.go:3572]
Oct 19 14:59:21 my.host.local mount[1528]: 2023/10/19 14:59:21.125009 juicefs[1528] : Data use s3://vol-www/myjfs/ [mount.go:605]
Oct 19 14:59:21 my.host.local mount[1528]: 2023/10/19 14:59:21.125910 juicefs[1528] : Disk cache (/var/jfsCache/96a19d9e-6e9b-44e2-b202-4d87592015d0/): cap…cache.go:114]
Oct 19 14:59:21 my.host.local /sbin/mount.juicefs[1551]: juicefs[1551] : Create session 35 OK with version: 1.1.0+2023-09-04.08c4ae6 [base.go:492]
Oct 19 14:59:21 my.host.local /sbin/mount.juicefs[1551]: juicefs[1551] : Prometheus metrics listening on 127.0.0.1:9567 [mount.go:160]
Oct 19 14:59:21 my.host.local /sbin/mount.juicefs[1551]: juicefs[1551] : Mounting volume myjfs at /mnt/s3 ... [mount_unix.go:269]
Oct 19 14:59:21 my.host.local mount[1528]: 2023/10/19 14:59:21.629189 juicefs[1528] : OK, myjfs is ready at /mnt/s3 [mount_unix.go:48]
Oct 19 14:59:21 my.host.local systemd[1]: Mounted Juicefs.



**Environment**:
- JuiceFS version: 1.1.0+2023-09-04.08c4ae6
- OS (e.g `cat /etc/os-release`): Oracle Linux Server 9.2
- Kernel (e.g. `uname -a`): Linux 5.15.0-106.131.4.el9uek.x86_64
- Object storage (cloud provider and region, or self maintained): self maintained
- Metadata engine info (version, cloud provider managed or self maintained): self maintained KeyDB v6.3.3
- Network connectivity (JuiceFS to metadata engine, JuiceFS to object storage): Local Net
- Others:

lowshoe commented 12 months ago

Short update: if I add "nofail" to the mount options, it works.
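
For readers following along, the change described here amounts to appending `nofail` to the `Options=` line of the unit shown earlier. The following is a minimal sketch that edits a scratch copy instead of the real /etc/systemd/system/mnt-s3.mount; the temporary file and the variable names `unit_file` and `options_line` are illustrative assumptions, not something from the original report.

```shell
# Work on a scratch copy so nothing under /etc is modified.
unit_file="$(mktemp)"
cat > "$unit_file" <<'EOF'
[Mount]
What=rediss://my.keydb.local:6379/1
Where=/mnt/s3
Type=juicefs
Options=_netdev,user_id=1002,group_id=1002,cache-size=1024000
EOF

# Append nofail to the Options= line unless it is already present.
if ! grep -q '^Options=.*nofail' "$unit_file"; then
    sed 's/^Options=\(.*\)$/Options=\1,nofail/' "$unit_file" > "$unit_file.new"
    mv "$unit_file.new" "$unit_file"
fi

# Show the resulting line for review.
options_line="$(grep '^Options=' "$unit_file")"
echo "$options_line"
```

After editing the real unit file, `systemctl daemon-reload` is needed before systemd picks up the change. According to systemd.mount(5), `nofail` makes the mount only wanted (not required) by remote-fs.target or local-fs.target and removes the ordering before those targets, which may be why the boot continues with it set.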

Hexilee commented 12 months ago

Hi @lowshoe, some more details are required:

Since the "nofail" option works, the juicefs mount may be getting stuck because KeyDB is not ready at that point.

lowshoe commented 11 months ago

@Hexilee,

Could it be that the juicefs mount is running in the foreground and is therefore blocking the rest of the boot process? /sbin/mount.juicefs is a symbolic link to /usr/local/bin/juicefs, and the default mode for juicefs is to run mount in the foreground...

Hexilee commented 11 months ago

Actually, /sbin/mount.juicefs runs in the background unless you set the environment variable JFS_FOREGROUND=1. Since nofail works in your case, systemd may consider that /sbin/mount.juicefs failed to run.

systemd may perform some checks to ensure the mount point is ready; see https://man.archlinux.org/man/systemd-mount.1.en#OPTIONS . Could you check whether the no-block or fsck options help?
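
One way to try the two suggested options is a transient mount with the `systemd-mount` command, where they are spelled `--no-block` and `--fsck=no` on the linked man page. The sketch below only assembles and prints the command so it can be reviewed first. The values are copied from the unit file earlier in the thread, the variable names are illustrative, and passing META_PASSWORD to the transient unit is not handled here.

```shell
# Values copied from the unit file earlier in the thread.
what='rediss://my.keydb.local:6379/1'
where='/mnt/s3'
opts='_netdev,user_id=1002,group_id=1002,cache-size=1024000'

# --no-block: do not wait synchronously for the mount job to finish.
# --fsck=no:  skip the file system check systemd-mount runs before mounting by default.
cmd="systemd-mount --no-block --fsck=no --type=juicefs --options=$opts $what $where"

# Print the command for review; nothing is mounted by this script.
echo "$cmd"
```

Running the printed command as root and comparing `systemctl status mnt-s3.mount` with and without each option should show whether either of them changes the behaviour.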

solracsf commented 11 months ago

If this helps: I've had some problems using mount too, and I've since switched to a service.

[Unit]
Description=JuiceFS mount
AssertFileIsExecutable=/usr/local/bin/juicefs
After=network-online.target systemd-resolved.service
Requires=network-online.target

[Service]
# Environment variables
Environment=JFS_RSA_PASSPHRASE=xxxxxxxxxxxxxx

# Type=notify can't be used: not implemented
Type=simple

# Ensure the S3 endpoint can be resolved
ExecStartPre=/bin/sh -c 'while ! host s3.endpoint.com; do /bin/sleep 0.2; done'

# Mount filesystem
ExecStart=/usr/local/bin/juicefs mount \
   --attr-cache 3600 \
   --backup-meta 0 \
   --cache-dir /var/cache/juicefs \
   --writeback \
   "badger:///var/lib/badger" \
   /opt/data

ExecStop=/bin/sh -c 'while lsof | grep /opt/data >/dev/null; do /bin/sleep 2; done'
ExecStop=/usr/local/bin/juicefs umount /opt/data

# Let systemd restart this service
Restart=always
RestartSec=5

# Disable timeout logic and wait until process is stopped
TimeoutStopSec=infinity
SendSIGKILL=no

# Priorities
OOMScoreAdjust=-500
IOSchedulingPriority=1

[Install]
WantedBy=multi-user.target
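
A note on the `ExecStartPre` loop in the unit above: the loop itself has no upper bound on retries. A bounded variant could look like the sketch below. `wait_until` is a hypothetical helper written for this example, not part of JuiceFS or systemd, and the attempt count in the usage comment is an arbitrary assumption.

```shell
# wait_until MAX_TRIES DELAY COMMAND...
# Re-runs COMMAND until it succeeds; gives up after MAX_TRIES failed attempts.
wait_until() {
    max_tries="$1"; delay="$2"; shift 2
    tries=0
    until "$@" >/dev/null 2>&1; do
        tries=$((tries + 1))
        if [ "$tries" -ge "$max_tries" ]; then
            return 1    # still failing after MAX_TRIES attempts
        fi
        sleep "$delay"
    done
    return 0
}

# Example: the same DNS check as in the unit, limited to 150 attempts
# (roughly 30 seconds with a 0.2 s delay):
# wait_until 150 0.2 host s3.endpoint.com
```

Saved as a script and called from `ExecStartPre=`, a non-zero exit makes the start attempt fail, and `Restart=always` with `RestartSec=5` then schedules a retry.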