CollaboraOnline / online

Collabora Online is a collaborative online office suite based on LibreOffice technology. This is also the source for the Collabora Office apps for iOS and Android.
https://collaboraonline.com

Rootless Collabora Sporadically comes down - Jail Permissions and `nftw()` Issues #5247

Closed stellarpower closed 6 months ago

stellarpower commented 2 years ago

Description

We are running Collabora in a rootless container under podman, behind a reverse proxy (so TLS is not required in the container and SSL is set to termination). This is pretty much the default compose setup as given in the online docs.

I have been launching the container privileged whilst trying to work this out; ideally I would then cut back the permissions later.

We were getting a whole load of messages about failures to bind mount, so we set mount_jail_tree to false in the config to quieten these and redeployed.

However, after some time, the active document reports that it is "unable to connect to the server" and Nextcloud bails out from that page:

frk-00032-00032 2022-09-05 20:14:44.392904 +0000 [ forkit ] WRN  The systemplate directory [/opt/cool/systemplate] is read-only, and at least [/opt/cool/systemplate//etc/hosts]
kit-06079-00032 2022-09-05 20:14:44.443905 +0000 [ kit_spare_315 ] ERR  linkOrCopy: nftw() failed for '/opt/cool/systemplate'| kit/Kit.cpp:485
kit-06079-00032 2022-09-05 20:14:44.444347 +0000 [ kit_spare_315 ] ERR  statfs failed on '/opt/cool/child-roots/KN7t2In13TTMceFD/lo/' (ENOENT: No such file or directory)| kit/K
kit-06079-00032 2022-09-05 20:14:44.453264 +0000 [ kit_spare_315 ] ERR  linkOrCopy: nftw() failed for '/opt/collaboraoffice'| kit/Kit.cpp:485
kit-06079-00032 2022-09-05 20:14:44.454189 +0000 [ kit_spare_315 ] ERR  Error while copying from /etc/passwd to /opt/cool/child-roots/KN7t2In13TTMceFD//etc/passwdP4VXbxxbLH53: 
kit-06079-00032 2022-09-05 20:14:44.454338 +0000 [ kit_spare_315 ] ERR  Failed to update the dynamic files in the jail [/opt/cool/child-roots/KN7t2In13TTMceFD/]. If the systemp
kit-06079-00032 2022-09-05 20:14:44.454540 +0000 [ kit_spare_315 ] ERR  mknod(/opt/cool/child-roots/KN7t2In13TTMceFD//tmp/dev/random) failed. Mount must not use nodev flag. (EP
kit-06079-00032 2022-09-05 20:14:44.454824 +0000 [ kit_spare_315 ] ERR  mknod(/opt/cool/child-roots/KN7t2In13TTMceFD//tmp/dev/urandom) failed. Mount must not use nodev flag. (E

Collabora Office 22.05 - Fatal Error: The application cannot be started. 
User installation could not be completed. 

frk-00032-00032 2022-09-05 20:14:45.439107 +0000 [ forkit ] WRN  No live Kits exist, and we are not terminating yet.| kit/ForKit.cpp:304
sh: 1: /usr/bin/coolmount: Operation not permitted
frk-00032-00032 2022-09-05 20:14:45.507096 +0000 [ forkit ] ERR  Failed to unmount [/opt/cool/child-roots/KN7t2In13TTMceFD/tmp]| common/JailUtil.cpp:72
sh: 1: /usr/bin/coolmount: Operation not permitted
frk-00032-00032 2022-09-05 20:14:45.565769 +0000 [ forkit ] ERR  Failed to unmount [/opt/cool/child-roots/KN7t2In13TTMceFD/lo]| common/JailUtil.cpp:72
sh: 1: /usr/bin/coolmount: Operation not permitted
frk-00032-00032 2022-09-05 20:14:45.640892 +0000 [ forkit ] ERR  Failed to unmount [/opt/cool/child-roots/KN7t2In13TTMceFD]| common/JailUtil.cpp:72

(full logs available below).

I've seen several messages showing it still trying to create /dev/random and /dev/urandom in one of the jails, and we get a message saying that the systemplate is read-only (I don't know why, as it's not a volume), and that it's failing to walk the filesystem tree in linkOrCopy (nftw() failed).
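Both failures can be probed independently of Collabora. The following is a minimal Python sketch, not Collabora's linkOrCopy code: `os.walk` stands in for `nftw()`, and `probe_mknod` tries to create a character device with the same major/minor numbers as /dev/random (1, 8). The function names and the probe file name are hypothetical.

```python
import errno
import os
import stat


def walk_errors(root):
    """Walk `root` and return the OSErrors hit while listing directories."""
    errors = []
    # os.walk silently skips unreadable directories unless onerror is given.
    for _dirpath, _dirnames, _filenames in os.walk(root, onerror=errors.append):
        pass
    return errors


def probe_mknod(directory, name="mknod-probe"):
    """Try to create a char device like /dev/random; return "OK" or an errno name."""
    path = os.path.join(directory, name)
    try:
        os.mknod(path, stat.S_IFCHR | 0o666, os.makedev(1, 8))
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.unlink(path)  # clean up if the node was actually created
    return "OK"


if __name__ == "__main__":
    for root in ("/opt/cool/systemplate", "/opt/collaboraoffice"):
        for err in walk_errors(root):
            print(f"walk {root}: {err}")
    print("mknod in /tmp:", probe_mknod("/tmp"))
```

Run inside the container as the same user as the kit processes, a result of `EPERM` from `probe_mknod` would be consistent with the mknod errors in the log above, though it would not by itself explain the nftw() failures.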

At this point, Nextcloud still complains that it can't connect to the server after closing the previous page and opening a different document.

Restarting the container may have helped sporadically. The first time we restarted, we got a segfault in the OConfigurationTreeRoot constructor, called from coolforkit's globalPreinit. This is included in the full logs below, and I presume the segfault is due to a mess-up in the container caused by the issues above.

Worryingly, I also noticed during early startup that Collabora was trying to remove and replace coolwsd.xml, which then failed because the file is bind-mounted in as a config.

It seems to me that the image requires permissions that are typically not available in a rootless setup, either by default or by necessity. For example, making device nodes should not be possible on the host unless running as root, so there is no way to achieve this inside the container. If there isn't a reasonably straightforward fix for this issue, I guess the way the jails are architected may need to be reconsidered for launches without a rootful container setup.
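One way to see what the container actually grants is to decode the effective capability mask from /proc/self/status. Below is a minimal Python sketch; the helper names are hypothetical, and the bit numbers are those from linux/capability.h for the four capabilities that coolforkit reports in the logs below. Holding cap_mknod inside a user namespace is not sufficient for device nodes, because the kernel checks CAP_MKNOD against the initial user namespace. That would be consistent with forkit logging "Have capability cap_mknod" while mknod() still fails with EPERM.

```python
# Bit positions from linux/capability.h for the capabilities coolforkit logs.
CAPS = {"cap_chown": 0, "cap_fowner": 3, "cap_sys_chroot": 18, "cap_mknod": 27}


def decode_caps(mask_hex):
    """Map each capability name to True if its bit is set in the hex mask."""
    mask = int(mask_hex, 16)
    return {name: bool((mask >> bit) & 1) for name, bit in CAPS.items()}


def parse_cap_eff(status_text):
    """Extract the CapEff hex mask from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("no CapEff line found")


if __name__ == "__main__":
    try:
        with open("/proc/self/status") as fh:
            print(decode_caps(parse_cap_eff(fh.read())))
    except OSError as exc:  # e.g. not running on Linux
        print("cannot read /proc/self/status:", exc)
```

Comparing the output inside the rootless container with the output on the host would show whether the capability set differs, or whether the same capabilities are present but scoped to the user namespace.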

Thanks!

Logs

coolwsd.xml

Same as the example from the repository, but with mount_jail_tree set to false.

Compose file:

  collabora:
    container_name:    collabora
    image:             docker.io/collabora/code:latest # image is b71c22066674 (don't see a tag anymore on docker hub)
    privileged:        true
    volumes:
      - /...//collabora/coolwsd-working.xml:/etc/coolwsd/coolwsd.xml
    environment:
      - aliasgroup1=nextcloud.domain.com:443
      - server_name=collabora.domain.com
      - dictionaries=en
      - extra_params=--o:ssl.enable=false --o:ssl.termination=true --o:logging.color=true
      - username=admin
      - password=password
    restart:           always

frk-00032-00032 [ forkit ] WRN  The systemplate directory [/opt/cool/systemplate] is read-only, and at least [/opt/cool/systemplate//etc/hosts]>
kit-06079-00032 [ kit_spare_315 ] ERR  linkOrCopy: nftw() failed for '/opt/cool/systemplate'| kit/Kit.cpp:485
kit-06079-00032 [ kit_spare_315 ] ERR  statfs failed on '/opt/cool/child-roots/KN7t2In13TTMceFD/lo/' (ENOENT: No such file or directory)| kit/K>
kit-06079-00032 [ kit_spare_315 ] ERR  linkOrCopy: nftw() failed for '/opt/collaboraoffice'| kit/Kit.cpp:485
kit-06079-00032 [ kit_spare_315 ] ERR  Error while copying from /etc/passwd to /opt/cool/child-roots/KN7t2In13TTMceFD//etc/passwdP4VXbxxbLH53: >
kit-06079-00032 [ kit_spare_315 ] ERR  Failed to update the dynamic files in the jail [/opt/cool/child-roots/KN7t2In13TTMceFD/]. If the systemp>
kit-06079-00032 [ kit_spare_315 ] ERR  mknod(/opt/cool/child-roots/KN7t2In13TTMceFD//tmp/dev/random) failed. Mount must not use nodev flag. (EP>
kit-06079-00032 [ kit_spare_315 ] ERR  mknod(/opt/cool/child-roots/KN7t2In13TTMceFD//tmp/dev/urandom) failed. Mount must not use nodev flag. (E>
Collabora Office 22.05 - Fatal Error: The application cannot be started. 
User installation could not be completed. 
frk-00032-00032 [ forkit ] WRN  No live Kits exist, and we are not terminating yet.| kit/ForKit.cpp:304
sh: 1: /usr/bin/coolmount: Operation not permitted
frk-00032-00032 [ forkit ] ERR  Failed to unmount [/opt/cool/child-roots/KN7t2In13TTMceFD/tmp]| common/JailUtil.cpp:72
sh: 1: /usr/bin/coolmount: Operation not permitted
frk-00032-00032 [ forkit ] ERR  Failed to unmount [/opt/cool/child-roots/KN7t2In13TTMceFD/lo]| common/JailUtil.cpp:72
sh: 1: /usr/bin/coolmount: Operation not permitted
frk-00032-00032 [ forkit ] ERR  Failed to unmount [/opt/cool/child-roots/KN7t2In13TTMceFD]| common/JailUtil.cpp:72

After restarting:

frk-00027-00027 [ coolforkit ] INF  Ignored setting RLIMIT_FSIZE to unlimited.| common/Seccomp.cpp:291
frk-00027-00027 [ coolforkit ] INF  Ignored setting RLIMIT_NOFILE to unlimited.| common/Seccomp.cpp:291
coolforkit version details: 22.05.5.4 - d58a5e2
frk-00027-00027 [ coolforkit ] DBG  New SocketPoll [UnitKit] owned by 0x7fe9c7e63780| net/Socket.cpp:192

frk-00027-00027 [ coolforkit ] INF  Have capability cap_sys_chroot| kit/ForKit.cpp:246
frk-00027-00027 [ coolforkit ] INF  Have capability cap_mknod| kit/ForKit.cpp:246
frk-00027-00027 [ coolforkit ] INF  Have capability cap_fowner| kit/ForKit.cpp:246
frk-00027-00027 [ coolforkit ] INF  Have capability cap_chown| kit/ForKit.cpp:246

frk-00027-00027 [ coolforkit ] TRC  dlopen(/opt/collaboraoffice/program/libmergedlo.so, RTLD_GLOBAL|RTLD_NOW)| kit/Kit.cpp:3037
frk-00027-00027 [ coolforkit ] TRC  Invoking lok_preinit_2(/opt/collaboraoffice/program", "file:///tmp/user")| kit/Kit.cpp:3091

Bootstrapping exception 'file:///opt/collaboraoffice/program/services/services.rdb: no such file /opt/collaboraoffice/debugsource/cppuhelper/source/servicemanager.cxx:1379'
frk-00027-00027 2022-09-05 20:25:12.125266 +0000 [ coolforkit ] SIG   Fatal signal received: SIGSEGV code: 1 for address: 0x0
Recent activity:

Backtrace 27 - forkit startup of 22.05.5.4 d58a5e2:

/usr/bin/coolforkit                          (SigUtil::dumpBacktrace()+0x85)[0x55ba121f4905]
/usr/bin/coolforkit                          (+0x1f5a65)                    [0x55ba121f5a65]
/lib/x86_64-linux-gnu/libpthread.so.0        (+0x12980)                     [0x7fe9c64e8980]
/opt/collaboraoffice/program/libmergedlo.so  (+0x148ef22)                   [0x7fe9c1689f22]
/opt/collaboraoffice/program/libmergedlo.so  (+0x30f83a5)                   [0x7fe9c32f33a5]
/opt/collaboraoffice/program/libmergedlo.so  (utl::OConfigurationTreeRoot::OConfigurationTreeRoot(com::sun::star::uno::Reference<com::sun::star::uno::XComponentContext> const&, r>
/opt/collaboraoffice/program/libmergedlo.so  (+0x24841bb)                   [0x7fe9c267f1bb]
/opt/collaboraoffice/program/libmergedlo.so  (+0x24b46ff)                   [0x7fe9c26af6ff]
/opt/collaboraoffice/program/libmergedlo.so  (lok_preinit_2+0x18)           [0x7fe9c26b0cb8]
/usr/bin/coolforkit                          (globalPreinit(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x226)[0x55ba1219c086]
/usr/bin/coolforkit                          (main+0x15a8)[0x55ba1213c3f8]
/lib/x86_64-linux-gnu/libc.so.6              (__libc_start_main+0xe7)       [0x7fe9c6106c87]
/usr/bin/coolforkit                          (_start+0x2a)                  [0x55ba1214514a]

wsd-00001-00026 [ prisoner_poll ] TRC  Poll completed with 0 live polls max (5000000us)(timedout)| net/Socket.cpp:356
wsd-00001-00026 [ prisoner_poll ] TRC  #19: Starting handling poll results of prisoner_poll at index 0 (of 1): 0| net/Socket.cpp:435

Another attempt - container removed and replaced:

wsd-00001-00031 [ prisoner_poll ] WRN  Removing dead spare child [9806].| wsd/COOLWSD.cpp:459
wsd-00001-00031 [ prisoner_poll ] WRN  Prisoner connection disconnected but without valid socket.| wsd/COOLWSD.cpp:3244
wsd-00001-00031 [ prisoner_poll ] WRN  An unassociated Kit disconnected.| wsd/COOLWSD.cpp:3259
wsd-00001-00031 [ prisoner_poll ] WRN  Prisoner connection disconnected but without valid socket.| wsd/COOLWSD.cpp:3244
wsd-00001-00031 [ prisoner_poll ] WRN  An unassociated Kit disconnected.| wsd/COOLWSD.cpp:3259
kit-09813-00032 [ kit_spare_4fb ] ERR  linkOrCopy: nftw() failed for '/opt/cool/systemplate'| kit/Kit.cpp:485
kit-09813-00032 [ kit_spare_4fb ] ERR  statfs failed on '/opt/cool/child-roots/KXOfxR271uXpCsfj/lo/' (ENOENT: No such file or directory)| kit/Kit.cpp:185
kit-09813-00032 [ kit_spare_4fb ] ERR  linkOrCopy: nftw() failed for '/opt/collaboraoffice'| kit/Kit.cpp:485
kit-09813-00032 [ kit_spare_4fb ] ERR  Error while copying from /etc/hosts to /opt/cool/child-roots/KXOfxR271uXpCsfj//etc/hostsqgEkT9P2FpDB: Failed to open dest /opt/cool/child-roots/KXOfxR271uXpCsfj//etc/hostsqgEkT9P2FpDB| common/FileUtil.cpp:162
kit-09813-00032 [ kit_spare_4fb ] ERR  Failed to update the dynamic files in the jail [/opt/cool/child-roots/KXOfxR271uXpCsfj/]. If the systemplate directory is owned by a superuser or is read-only, running the installation scripts with the owner's account should update these files. Some functionality may be missing.| kit/Kit.cpp:2731
kit-09813-00032 [ kit_spare_4fb ] ERR  mknod(/opt/cool/child-roots/KXOfxR271uXpCsfj//tmp/dev/random) failed. Mount must not use nodev flag. (EPERM: Operation not permitted)| common/JailUtil.cpp:263
kit-09813-00032 [ kit_spare_4fb ] ERR  mknod(/opt/cool/child-roots/KXOfxR271uXpCsfj//tmp/dev/urandom) failed. Mount must not use nodev flag. (EPERM: Operation not permitted)| common/JailUtil.cpp:275
Collabora Office 22.05 - Fatal Error: The application cannot be started. 
User installation could not be completed. 
frk-00032-00032 [ forkit        ] WRN  No live Kits exist, and we are not terminating yet.| kit/ForKit.cpp:304
Security: coolmount incorrect user-name, other than 'cool'
Aborting.
frk-00032-00032 [ forkit        ] ERR  Failed to unmount [/opt/cool/child-roots/KXOfxR271uXpCsfj/tmp]| common/JailUtil.cpp:72
Security: coolmount incorrect user-name, other than 'cool'
Aborting.
frk-00032-00032 [ forkit        ] ERR  Failed to unmount [/opt/cool/child-roots/KXOfxR271uXpCsfj/lo]| common/JailUtil.cpp:72
Security: coolmount incorrect user-name, other than 'cool'
Aborting.
frk-00032-00032 [ forkit        ] ERR  Failed to unmount [/opt/cool/child-roots/KXOfxR271uXpCsfj]| common/JailUtil.cpp:72
wsd-00001-09799 [ docbroker_02e ] WRN  Waking up dead poll thread [HttpSynReqPoll], started: false, finished: false| net/Socket.hpp:727
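
For triage, the log lines above can be tallied by level and source location. This is a minimal Python sketch; the regular expression is inferred from the lines quoted in this issue (process tag, optional timestamp, thread name in brackets, three-letter level, message, then `| file:line`), not from any documented coolwsd log format, and the names `LOG_RE` and `tally` are hypothetical.

```python
import re
from collections import Counter

# Pattern inferred from the log excerpts in this issue (an assumption).
LOG_RE = re.compile(
    r"^(?P<proc>\w+-\d+-\d+)\s+"
    r"(?:(?P<ts>\d{4}-\d\d-\d\d \d\d:\d\d:\d\d\.\d+ [+-]\d{4})\s+)?"
    r"\[\s*(?P<thread>\S+)\s*\]\s+"
    r"(?P<level>[A-Z]{3})\s+"
    r"(?P<msg>.*?)"
    r"(?:\|\s*(?P<src>\S+:\d+))?$"
)


def tally(lines, level="ERR"):
    """Count log lines of the given level per source location (file:line)."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.match(line.strip())
        if m and m["level"] == level:
            # Truncated lines have no usable file:line suffix.
            counts[m["src"] or "<unknown>"] += 1
    return counts
```

Lines that do not match the pattern, such as the `sh: 1: /usr/bin/coolmount` messages, are skipped rather than counted.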


Nayesz commented 1 year ago

I'm facing the same problem. Any progress?

stellarpower commented 1 year ago

So far things seem to be stable since pulling a new image after opening the issue, although we aren't certain that the problem hasn't come up at some point since then.

timur-g commented 9 months ago

Can you please retest this and comment? Also, please check whether it is related to #2800.

timur-g commented 6 months ago

No response, so I am closing this.