keybase / client


/keybase mount, on linux, behaves poorly - hangs `ls /` #26864

Open mcint opened 5 months ago

mcint commented 5 months ago

Keybase reliably causes ls / to hang.

Additionally, when I "Quit Keybase" (an option that is unreasonably hidden and discouraged, with echoes of a dark pattern), it doesn't clean up the mount, and the same issue still occurs!

statx(AT_FDCWD, "/usr", AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW|AT_NO_AUTOMOUNT, STATX_MODE|STATX_NLINK|STATX_UID|STATX_GID|STATX_MTIME|STATX_SIZE, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFDIR|0755, stx_size=14, ...}) = 0
lgetxattr("/usr", "security.selinux", 0x63ab931f4fe0, 255) = -1 ENODATA (No data available)
getxattr("/usr", "system.posix_acl_access", NULL, 0) = -1 ENODATA (No data available)
getxattr("/usr", "system.posix_acl_default", NULL, 0) = -1 ENODATA (No data available)
statx(AT_FDCWD, "/etc", AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW|AT_NO_AUTOMOUNT, STATX_MODE|STATX_NLINK|STATX_UID|STATX_GID|STATX_MTIME|STATX_SIZE, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFDIR|0755, stx_size=285, ...}) = 0
lgetxattr("/etc", "security.selinux", 0x63ab931f5000, 255) = -1 ENODATA (No data available)
getxattr("/etc", "system.posix_acl_access", NULL, 0) = -1 ENODATA (No data available)
getxattr("/etc", "system.posix_acl_default", NULL, 0) = -1 ENODATA (No data available)
statx(AT_FDCWD, "/run", AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW|AT_NO_AUTOMOUNT, STATX_MODE|STATX_NLINK|STATX_UID|STATX_GID|STATX_MTIME|STATX_SIZE, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=STATX_ATTR_MOUNT_ROOT, stx_mode=S_IFDIR|0755, stx_size=1160, ...}) = 0
lgetxattr("/run", "security.selinux", 0x63ab931f5020, 255) = -1 ENODATA (No data available)
getxattr("/run", "system.posix_acl_access", NULL, 0) = -1 ENODATA (No data available)
getxattr("/run", "system.posix_acl_default", NULL, 0) = -1 ENODATA (No data available)
statx(AT_FDCWD, "/keybase", AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW|AT_NO_AUTOMOUNT, STATX_MODE|STATX_NLINK|STATX_UID|STATX_GID|STATX_MTIME|STATX_SIZE, ^C
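
The trace above stops at the statx call on /keybase, which suggests a way to find the blocking entry without hanging the shell. The following is a minimal Python sketch, not anything from the Keybase codebase: it runs lstat on each top-level entry in a child process and gives up after a timeout. The names probe and probe_root and the 2-second default are hypothetical, and it assumes the stuck child can be killed, which may not hold for every kind of hang.

```python
import os
import subprocess
import sys


def probe(path, timeout=2.0):
    """Return 'ok', 'error', or 'hang' for an lstat of path run in a child process."""
    # The child does the lstat, so a blocked filesystem stalls the child, not us.
    cmd = [sys.executable, "-c", "import os, sys; os.lstat(sys.argv[1])", path]
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    try:
        return "ok" if proc.wait(timeout=timeout) == 0 else "error"
    except subprocess.TimeoutExpired:
        # Kill but do not wait again: a child stuck in the kernel may not exit promptly.
        proc.kill()
        return "hang"


def probe_root(root="/", timeout=2.0):
    """Probe every entry directly under root and map its path to a status string."""
    results = {}
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        results[path] = probe(path, timeout)
    return results
```

On an affected machine, one would expect probe_root() to report 'hang' for /keybase and 'ok' for the other entries, matching where the strace output stops.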

Start with debugging

Running [sudo] systemctl status keybase* shows that keybase.mount is the only matching systemd unit.

$ sudo systemctl status keybase.mount
 ● keybase.mount - /keybase
      Loaded: loaded (/proc/self/mountinfo)
      Active: active (mounted) since Wed 2024-05-15 13:31:38 PDT; 1h 22min ago
       Where: /keybase
        What: keybase-redirector
         CPU: 8ms

After quitting Keybase (not merely closing it), keybase.mount is still mounted, but now returns errors:

$ ls /
ls: cannot access '/keybase': Transport endpoint is not connected
...

To resolve this, run systemctl stop keybase.mount, but only after Keybase has been quit.
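
"Transport endpoint is not connected" is the standard message for errno ENOTCONN, which is what a FUSE mount point typically returns once its serving process has gone away. As a rough sketch (the function name mount_state and its return labels are made up for illustration), a script could tell this stale state apart from a healthy or missing mount point before deciding to stop keybase.mount. Note that this calls lstat in-process, so it is only suitable for the stale case described here; if the redirector is still running but unresponsive, the call itself may block.

```python
import errno
import os


def mount_state(path):
    """Classify path as 'ok', 'stale' (ENOTCONN), 'missing', or 'error'."""
    try:
        os.lstat(path)
    except FileNotFoundError:
        return "missing"
    except OSError as exc:
        # ENOTCONN is the errno behind "Transport endpoint is not connected".
        return "stale" if exc.errno == errno.ENOTCONN else "error"
    return "ok"
```

A result of 'stale' for /keybase would correspond to the ls error shown above.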

System Information

$ lsb_release -a
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.4 LTS
Release:    22.04
Codename:   jammy
$ uname -rvs
Linux 6.5.0-28-generic #29~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Apr  4 14:39:20 UTC 2

doqfgc commented 5 months ago

Can confirm this is the case.

It also causes all Flatpaks to hang or not start at all while Keybase is running.

jbvv1 commented 5 months ago

I also believe that this same issue causes df to hang; sudo strace df showed that df was hanging when enumerating the /keybase mount. After I unloaded the keybase mount using systemctl stop keybase.mount, I was able to run df without any issues.
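
Since df stalls while walking the mount table, it can help to list FUSE mounts without touching them. The sketch below is an illustration only: it parses /proc/self/mounts-style text, which should not require talking to the filesystems themselves. The function names are hypothetical, and it assumes the /keybase entry has a FUSE filesystem type (fuse, fuse.something, or fuseblk), which this thread does not show directly.

```python
def fuse_mounts(mounts_text):
    """Return (source, mountpoint, fstype) for each FUSE entry in /proc/mounts-style text."""
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        source, mountpoint, fstype = fields[0], fields[1], fields[2]
        # Skip "fusectl", which is the kernel's control filesystem, not a user mount.
        if fstype in ("fuse", "fuseblk") or fstype.startswith("fuse."):
            found.append((source, mountpoint, fstype))
    return found


def current_fuse_mounts(path="/proc/self/mounts"):
    """Read the current mount table (Linux only) and return its FUSE entries."""
    with open(path, encoding="utf-8") as handle:
        return fuse_mounts(handle.read())
```

Any mount point returned here is a candidate for the kind of timeout-guarded check that avoids blocking on a dead server.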

phouverneyuff commented 5 months ago

Same issue on Ubuntu 24.04

I need to start keybase -> kill it -> umount /keybase -> run keybase again

GwynethLlewelyn commented 5 months ago

Ah. Interesting. I have one server still on Ubuntu 22.04.4. For a while now (a few weeks?), there has been an issue in mounting /keybase — it doesn't work at all. I haven't figured out why, but the error was similar to what you report as "hanging". Or it would refuse to mount. The culprit seems not to be the Keybase daemon at all, not even the FUSE subsystem, but rather the ingenious Redirector — that's the one that fails to work.

What happens when you do a systemctl --user status keybase-redirector.service (as the non-privileged user)?

In my case, I get the following:

× keybase-redirector.service - Keybase Root Redirector for KBFS
     Loaded: loaded (/usr/lib/systemd/user/keybase-redirector.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Wed 2024-06-05 13:00:14 WEST; 53s ago
    Process: 1199828 ExecStartPre=/usr/bin/keybase --use-root-config-file config get --direct --assert-false --assert-ok-on-nil disable-root-redirector (code=exited, status=2)
        CPU: 41ms

Jun 05 13:00:14 myServer systemd[1093]: keybase-redirector.service: Scheduled restart job, restart counter is at 5.
Jun 05 13:00:14 myServer systemd[1093]: Stopped Keybase Root Redirector for KBFS.
Jun 05 13:00:14 myServer systemd[1093]: keybase-redirector.service: Start request repeated too quickly.
Jun 05 13:00:14 myServer systemd[1093]: keybase-redirector.service: Failed with result 'exit-code'.
Jun 05 13:00:14 myServer systemd[1093]: Failed to start Keybase Root Redirector for KBFS.

The journal is not very informative, either (journalctl --user -u keybase-redirector.service):

[...]
Jun 05 13:00:14 myServer keybase[1199817]: true
Jun 05 13:00:14 myServer keybase[1199817]: 2024-06-05T13:00:14.489592+01:00 ▶ [ERRO keybase main.go:86] 001 Assertion failed.
Jun 05 13:00:14 myServer systemd[1093]: keybase-redirector.service: Control process exited, code=exited, status=2/INVALIDARGUMENT
Jun 05 13:00:14 myServer systemd[1093]: keybase-redirector.service: Failed with result 'exit-code'.
Jun 05 13:00:14 myServer systemd[1093]: Failed to start Keybase Root Redirector for KBFS.
Jun 05 13:00:14 myServer systemd[1093]: keybase-redirector.service: Scheduled restart job, restart counter is at 4.
Jun 05 13:00:14 myServer systemd[1093]: Stopped Keybase Root Redirector for KBFS.
Jun 05 13:00:14 myServer systemd[1093]: Starting Keybase Root Redirector for KBFS...
Jun 05 13:00:14 myServer keybase[1199828]: true
Jun 05 13:00:14 myServer keybase[1199828]: 2024-06-05T13:00:14.739666+01:00 ▶ [ERRO keybase main.go:86] 001 Assertion failed.
Jun 05 13:00:14 myServer systemd[1093]: keybase-redirector.service: Control process exited, code=exited, status=2/INVALIDARGUMENT
Jun 05 13:00:14 myServer systemd[1093]: keybase-redirector.service: Failed with result 'exit-code'.
Jun 05 13:00:14 myServer systemd[1093]: Failed to start Keybase Root Redirector for KBFS.
Jun 05 13:00:14 myServer systemd[1093]: keybase-redirector.service: Scheduled restart job, restart counter is at 5.
Jun 05 13:00:14 myServer systemd[1093]: Stopped Keybase Root Redirector for KBFS.
Jun 05 13:00:14 myServer systemd[1093]: keybase-redirector.service: Start request repeated too quickly.
Jun 05 13:00:14 myServer systemd[1093]: keybase-redirector.service: Failed with result 'exit-code'.
Jun 05 13:00:14 myServer systemd[1093]: Failed to start Keybase Root Redirector for KBFS.
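
In the journal above, the ExecStartPre command prints true and then fails with "Assertion failed." and status 2. One possible reading, which is an assumption based on the flag names and not on Keybase documentation, is that disable-root-redirector is set to true in the root config, so the --assert-false check fails and systemd never starts the redirector. A toy model of that assumed gate (the function name is hypothetical, and this is not Keybase's code):

```python
def redirector_precheck_exit(value):
    """Model the assumed ExecStartPre gate; value is the config value, or None if unset."""
    if value is None:
        return 0  # assumed meaning of --assert-ok-on-nil: an unset key passes
    if value is False:
        return 0  # assumed meaning of --assert-false: false passes
    return 2  # matches the exit status in the journal (status=2)
```

Under this model, the printed value true maps to exit status 2, which is what the journal reports.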

My quick hack was to turn it off, and simply look for my files at /run/user/1000/keybase/kbfs/ (my UID is 1000, as you can guess). Since I'm the only user on that server who has Keybase enabled for their account, I'm seriously considering turning /keybase into just a symlink to /run/user/1000/keybase/kbfs/! 🤣
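
For anyone scripting around this, the per-user path can be derived from the UID instead of hard-coding 1000. This is a small sketch that assumes the /run/user/UID/keybase/kbfs layout mentioned above holds on your system; the function name is made up for illustration.

```python
import os


def kbfs_user_mount(uid=None):
    """Return the assumed per-user KBFS path for uid (defaults to the current user)."""
    uid = os.getuid() if uid is None else uid
    return f"/run/user/{uid}/keybase/kbfs"
```

With uid=1000 this yields the same path as in the comment above.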

jliedy commented 4 months ago

Same issue running Ubuntu 24.04, with the /keybase mount point hanging. It causes Chromium-based browsers (and a couple of other apps) to hang for a few moments while they scan the FS on certain events. I have to unmount it in order to keep some apps running smoothly.

mattbenscho commented 2 months ago

I have this issue as well, which is why I use Keybase only on my phone for now (I just use the chat). I subscribed to this issue in case there is any activity.

jlp78 commented 2 months ago

@GwynethLlewelyn's notes are consistent with the behavior I've seen as well. The Keybase mount itself works fine, no problems. The problem is in the /keybase mount, which is serviced by keybase-redirector. Running systemctl --user stop keybase-redirector.service terminates the mount and unblocks anything that was waiting on it. Correspondingly, systemctl --user disable keybase-redirector.service keeps it from starting when you log in. This isn't a solution, but it at least gets your system unblocked.

As I recall, this was why we never mounted NFS filesystems directly under /: if the file server went away, everything that traversed / would hang. Looks like that bit of lore got lost over the decades.
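
If you want to script the workaround, here is a minimal sketch that wraps the systemctl --user commands from this comment. The helper names are hypothetical, only the three actions mentioned in this thread are allowed, and it has to run as the affected user, not as root.

```python
import subprocess

UNIT = "keybase-redirector.service"  # unit name as it appears in this thread
ALLOWED_ACTIONS = ("status", "stop", "disable")


def redirector_command(action):
    """Build the systemctl --user command line for one of the allowed actions."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return ["systemctl", "--user", action, UNIT]


def run_redirector(action):
    """Run the command and return systemctl's exit status without raising on failure."""
    return subprocess.run(redirector_command(action), check=False).returncode
```

Calling run_redirector("stop") followed by run_redirector("disable") mirrors the two steps described above; as noted, it unblocks the system but does not fix the underlying problem.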

jlp78 commented 2 months ago

This is likely the same issue as https://github.com/keybase/client/issues/26108

jlp78 commented 2 months ago

Also https://github.com/keybase/client/issues/26017, https://github.com/keybase/client/issues/24764 (from 2022!), https://github.com/keybase/client/issues/24749, and likely several other open issues.