FRRouting / frr

The FRRouting Protocol Suite
https://frrouting.org/

ospfd requires cap_sys_admin #8681

Open javier-godoy opened 3 years ago

javier-godoy commented 3 years ago

I had previously used quagga 1.2.4-r2 (particularly ospfd) in a docker container based on alpine 3.11.3

When trying to migrate to frr-7.3.1-r0 in a docker container based on alpine 3.12.3 (binaries installed with apk) I found that ospfd and zebra fail to initialize, with the following message:

privs_init: initial cap_set_proc failed: Operation not permitted
Wanted caps: = cap_net_admin,cap_net_raw,cap_sys_admin+p
Have   caps: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_admin,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap+ip

I'm not granting cap_sys_admin to the container, and I would like to avoid it if possible (Quagga's ospfd worked fine with only cap_net_admin and cap_net_raw).
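
To make the mismatch in the message above explicit, here is a minimal sketch that compares the two capability lists. It is not part of FRR: the helper names `parse_caps` and `missing_caps` are hypothetical, and the parser only handles the single-clause text shown in this log, not the full libcap text format.

```python
import re

# Matches one clause such as "= cap_a,cap_b+p" or "cap_a,cap_b=ip".
_CLAUSE = re.compile(r"^(?:=\s+)?(?P<names>[a-z0-9_,]+)[=+](?P<flags>[eip]+)$")

# Copied from the error message above.
WANTED = "= cap_net_admin,cap_net_raw,cap_sys_admin+p"
HAVE = ("= cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,"
        "cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,"
        "cap_net_admin,cap_net_raw,cap_sys_chroot,cap_mknod,"
        "cap_audit_write,cap_setfcap+ip")


def parse_caps(text):
    """Return (capability names, flag letters) for a single-clause text."""
    match = _CLAUSE.match(text.strip())
    if match is None:
        raise ValueError(f"unsupported capability text: {text!r}")
    return set(match.group("names").split(",")), set(match.group("flags"))


def missing_caps(wanted, have):
    """Capabilities wanted as permitted that the process does not hold as permitted."""
    wanted_names, _ = parse_caps(wanted)
    have_names, have_flags = parse_caps(have)
    # Only names carrying the "p" flag count as permitted.
    permitted = have_names if "p" in have_flags else set()
    return wanted_names - permitted


print(sorted(missing_caps(WANTED, HAVE)))  # prints ['cap_sys_admin']
```

The only capability missing from the permitted set is cap_sys_admin, which matches the failure.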

As I understand it, cap_sys_admin was added to FRR in https://github.com/FRRouting/frr/pull/1818, and the workaround from https://github.com/FRRouting/frr/issues/2007 is not possible since the container is not granted that capability.

Is there any way to avoid the check for cap_sys_admin?

qlyoung commented 3 years ago

We agree with you that minimum privileges is good practice. Unfortunately we need this cap in order to be able to switch network namespaces for namespace based VRFs. From setns(2):

Network, IPC, time, and UTS namespaces
       In order to reassociate itself with a new network, IPC, time, or UTS namespace,
       the caller must have the CAP_SYS_ADMIN capability both in its own user namespace
       and in the user namespace that owns the target namespace.

Quagga didn't have this feature and so didn't require this capability.

I'll note that we only elevate privileges when necessary and drop them when done (you can grep for frr_with_privs(...) {...} blocks if you want to see how it's done).
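
As a rough illustration of that raise-then-drop pattern, here is a toy model using a Python context manager. It is not FRR's C macro and does not touch real process capabilities; the class `PrivilegeModel` and its method `raised` are hypothetical names used only for this sketch.

```python
from contextlib import contextmanager


class PrivilegeModel:
    """Toy model: a fixed permitted set and an effective set that is normally empty."""

    def __init__(self, permitted):
        self.permitted = frozenset(permitted)
        self.effective = set()

    @contextmanager
    def raised(self):
        # Raise: copy the permitted capabilities into the effective set.
        self.effective = set(self.permitted)
        try:
            yield self
        finally:
            # Drop: clear the effective set even if the block raised an error.
            self.effective = set()


privs = PrivilegeModel({"cap_net_admin", "cap_net_raw", "cap_sys_admin"})
with privs.raised():
    inside = set(privs.effective)   # capabilities are effective only in here
after = set(privs.effective)        # empty again once the block exits
```

The point of the model is that a capability has to be in the permitted set up front so that it can be raised later, even if most code paths never raise it.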

Of course in theory we could make needing CAP_SYS_ADMIN be a thing that's only compiled in when you compile in namespace VRFs, but nobody's done the work for that and it's unlikely to be done given that it works as is. Sorry for the inconvenience. If you have a patch for it we can definitely work with you to get that in.

javier-godoy commented 3 years ago

Thanks for your quick response. For now I'll recompile with --disable-capabilities (since I'm already restricting the capabilities of the docker container that runs FRR). Based on your explanation I understand that it should work since my usage of FRR doesn't require CAP_SYS_ADMIN.

Once I'm certain that it works as intended I'll submit a PR. My approach would be adding a --disable-cap-sys-admin configuration and initializing _caps_p with or without ZCAP_SYS_ADMIN depending on it (I think there is no configuration flag for compiling in namespace VRFs). Would that be OK for you?

pguibert6WIND commented 3 years ago

There is already a flag named '--vrfwnetns' that, if not present, should be used to disable cap-sys-admin wherever needed. This flag is part of zebra, and zebra should send information via zapi to other daemons to inform them that the backend is not vrfwnetns.

Aren't you in the case where you are using BGP in standalone mode as a route reflector?

javier-godoy commented 3 years ago

@pguibert6WIND I'm not starting zebra with --vrfwnetns, and it fails immediately after starting. I think the failure is because the capability isn't permitted, and not an actual attempt to raise it. https://github.com/FRRouting/frr/blob/58ba06470c6b62b4305eec940a4e49b34baae904/lib/privs.c#L312-L315

Aren't you in the case where you are using BGP in standalone mode as a route reflector?

I'm not using BGP, just OSPF.

qlyoung commented 3 years ago

My approach would be adding a --disable-cap-sys-admin configuration and initializing _caps_p with or without ZCAP_SYS_ADMIN depending on it

Sorry for the delay. Yeah, that sounds appropriate.

pguibert6WIND commented 3 years ago

@pguibert6WIND I'm not starting zebra with --vrfwnetns, and it fails immediately after starting. I think the failure is because the capability isn't permitted, and not an actual attempt to raise it.

https://github.com/FRRouting/frr/blob/58ba06470c6b62b4305eec940a4e49b34baae904/lib/privs.c#L312-L315

Aren't you in the case where you are using BGP in standalone mode as a route reflector?

I'm not using BGP, just OSPF.

I meant that you don't need a new flag. If you don't use vrfwnetns, FRR should default to disabling cap_sys_admin.

javier-godoy commented 3 years ago

Please see https://github.com/javier-godoy/frr-alpine/blob/test/init (using FRR 7.5.1 from Alpine 3.13.5)

chown -R frr:frr /etc/frr 
/usr/lib/frr/zebra -v 
/usr/lib/frr/zebra &
/usr/lib/frr/ospfd &
sleep 1
ps aux

When only NET_ADMIN is granted, zebra and ospfd fail with the following log and the processes are not started.

Note that it wants cap_sys_admin=p (not cap_sys_admin itself, but only as a permitted capability, I guess in order to effectively acquire it if/when needed). In this example init is the one who doesn't have cap_sys_admin=p so it cannot grant it to its children, even if they are never going to raise it as an effective capability.

Attaching to frrtest_testfrr_1
testfrr_1  | chown -R frr:frr /etc/frr
testfrr_1  | /usr/lib/frr/zebra -v
testfrr_1  | zebra version 7.5.1
testfrr_1  | Copyright 1996-2005 Kunihiro Ishiguro, et al.
testfrr_1  | configured with:
testfrr_1  |    '--prefix=/usr' '--enable-exampledir=/usr/share/doc/frr/examples/' '--localstatedir=/run/frr' '--sbindir=/usr/lib/frr' '--sysconfdir=/etc/frr' '--libdir=/usr/lib/frr' '--with-moduledir=/usr/lib/frr/modules' '--disable-dependency-tracking' '--enable-systemd=no' '--enable-rpki' '--with-libpam' '--enable-doc' '--enable-doc-html' '--enable-snmp' '--enable-fpm' '--disable-protobuf' '--disable-zeromq' '--enable-ospfapi' '--enable-bgp-vnc' '--enable-multipath=256' '--enable-user=frr' '--enable-group=frr' '--enable-vty-group=frrvty' '--enable-configfile-mask=0640' '--enable-logfile-mask=0640' 'CC=gcc' 'CXX=g++' 'PYTHON=python3'
testfrr_1  | /usr/lib/frr/zebra &
testfrr_1  | /usr/lib/frr/ospfd &
testfrr_1  | sleep 1
testfrr_1  | privs_init: initial cap_set_proc failed: Operation not permitted
testfrr_1  | Wanted caps: cap_net_admin,cap_net_raw,cap_sys_admin=p
testfrr_1  | Have   caps: cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_admin,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap=ip
testfrr_1  | privs_init: initial cap_set_proc failed: Operation not permitted
testfrr_1  | Wanted caps: cap_net_bind_service,cap_net_admin,cap_net_raw,cap_sys_admin=p
testfrr_1  | Have   caps: cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_admin,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap=ip
testfrr_1  | ps aux
testfrr_1  | PID   USER     TIME  COMMAND
testfrr_1  |     1 root      0:00 /sbin/tini -- init
testfrr_1  |     6 root      0:00 {init} /bin/sh /usr/local/bin/init
testfrr_1  |    12 root      0:00 ps aux
testfrr_1  | tail -f /dev/null
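
The note about cap_sys_admin=p can be checked from inside a container by decoding the capability masks in /proc/self/status. The sketch below is an illustration, not FRR code: the helper names are hypothetical, and the sample CapPrm value is reconstructed from the "Have caps" list in the log above rather than captured from the container. The subset check mirrors the rule in capset(2) that a new permitted set must be a subset of the existing one.

```python
# Bit numbers from <linux/capability.h> for the capabilities in this thread.
CAP_BITS = {"cap_net_admin": 12, "cap_net_raw": 13, "cap_sys_admin": 21}

# Assumption: mask rebuilt by hand from the "Have caps" list, not real output.
SAMPLE_STATUS = "Name:\tzebra\nCapPrm:\t00000000a80435fb\nCapEff:\t00000000a80435fb\n"


def read_cap_masks(status_text):
    """Extract the Cap* hex masks from the text of /proc/<pid>/status."""
    masks = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in ("CapInh", "CapPrm", "CapEff", "CapBnd"):
            masks[key] = int(value.strip(), 16)
    return masks


def has_cap(mask, name):
    """True if the named capability bit is set in the mask."""
    return bool((mask >> CAP_BITS[name]) & 1)


def can_set_permitted(current_permitted, wanted_permitted):
    """A new permitted set is only accepted if it is a subset of the current one."""
    return wanted_permitted & ~current_permitted == 0


masks = read_cap_masks(SAMPLE_STATUS)
# What ospfd asks for in the log: cap_net_admin, cap_net_raw, cap_sys_admin.
wanted = sum(1 << CAP_BITS[name] for name in CAP_BITS)
print(has_cap(masks["CapPrm"], "cap_sys_admin"))     # prints False
print(can_set_permitted(masks["CapPrm"], wanted))    # prints False
```

To inspect a live process instead of the sample, pass the contents of /proc/self/status to `read_cap_masks`.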

When both NET_ADMIN and SYS_ADMIN are granted, then zebra and ospfd start correctly:

Attaching to frrtest_testfrr_1
testfrr_1  | chown -R frr:frr /etc/frr
testfrr_1  | /usr/lib/frr/zebra -v
testfrr_1  | /usr/lib/frr/zebra &
testfrr_1  | /usr/lib/frr/ospfd &
testfrr_1  | sleep 1
testfrr_1  | zebra version 7.5.1
testfrr_1  | Copyright 1996-2005 Kunihiro Ishiguro, et al.
testfrr_1  | configured with:
testfrr_1  |    '--prefix=/usr' '--enable-exampledir=/usr/share/doc/frr/examples/' '--localstatedir=/run/frr' '--sbindir=/usr/lib/frr' '--sysconfdir=/etc/frr' '--libdir=/usr/lib/frr' '--with-moduledir=/usr/lib/frr/modules' '--disable-dependency-tracking' '--enable-systemd=no' '--enable-rpki' '--with-libpam' '--enable-doc' '--enable-doc-html' '--enable-snmp' '--enable-fpm' '--disable-protobuf' '--disable-zeromq' '--enable-ospfapi' '--enable-bgp-vnc' '--enable-multipath=256' '--enable-user=frr' '--enable-group=frr' '--enable-vty-group=frrvty' '--enable-configfile-mask=0640' '--enable-logfile-mask=0640' 'CC=gcc' 'CXX=g++' 'PYTHON=python3'
testfrr_1  | 2021/06/04 00:45:06 ZEBRA: [EC 4043309111] Disabling MPLS support (no kernel support)
testfrr_1  | ps aux
testfrr_1  | PID   USER     TIME  COMMAND
testfrr_1  |     1 root      0:00 /sbin/tini -- init
testfrr_1  |     6 root      0:00 {init} /bin/sh /usr/local/bin/init
testfrr_1  |     9 frr       0:00 /usr/lib/frr/zebra
testfrr_1  |    10 frr       0:00 /usr/lib/frr/ospfd
testfrr_1  |    15 root      0:00 ps aux
testfrr_1  | tail -f /dev/null

mjstapp commented 3 years ago

Yes, I'm not sure what Phillipe meant: several daemons unconditionally request CAP_SYS_ADMIN currently.

Jean-Daniel commented 3 years ago

I think this issue shouldn't be limited to ospfd and also cover other daemons (zebra, bgpd, …).

I'm trying to set up a simple FRR instance as a BGP speaker (it publishes routes to a load balancer and doesn't need to handle any incoming routes), and I have the same issue.

pguibert6WIND commented 3 years ago

Yes, I'm not sure what Phillipe meant: several daemons unconditionally request CAP_SYS_ADMIN currently.

I was thinking the capabilities were requested by vrfwnetns mode, whereas this is the container that requests it.

yangtzeriverli commented 2 years ago

I think this issue shouldn't be limited to ospfd and also cover other daemons (zebra, bgpd, …).

I'm trying to set up a simple FRR instance as a BGP speaker (it publishes routes to a load balancer and doesn't need to handle any incoming routes), and I have the same issue.

As I am just experimenting, I ran the Docker container with --privileged and got no error.