piraeusdatastore / piraeus-operator

The Piraeus Operator manages LINSTOR clusters in Kubernetes.
https://piraeus.io/
Apache License 2.0

Udev-rules are not working for piraeus-provisioned volumes #291

Open kvaps opened 2 years ago

kvaps commented 2 years ago

Not sure if this is a bug or feature, so let's consider this first.

The piraeus-operator does not share the DRBD configuration between the satellite pod and the node, so even if the node has drbd-utils installed, its udev rules will never generate symlinks in /dev/drbd/by-disk/ and /dev/drbd/by-res/ for LINSTOR devices.

To fix this issue we would need to share /etc/drbd.d/linstor-resources.res and /var/lib/linstor.d/ between the host and satellite pod.

See the udev rules from drbd-utils for more information: https://github.com/LINBIT/drbd-utils/blob/05c0797248af6f4e3b5b04545fe068dba41e3d81/scripts/drbd.rules.in#L6

WanzenBug commented 2 years ago

In short: it's neither a bug nor a feature. It just isn't really relevant for Piraeus working.

Do you have a specific use case where you need these symlinks?

kvaps commented 2 years ago

Sometimes it can simplify debugging and management, nothing more in this case.

WanzenBug commented 2 years ago

Yeah, I sometimes wished for that to work myself. I even experimented a bit with having udev be triggered in the container (which works, at least in theory, since we are in a privileged container). But I never finished it, because I thought the effort wasn't worth it.

kvaps commented 2 years ago

I think this is not a bad idea. We can run linstor-node with hostPID: true, then have the piraeus-entry.sh script generate the udev rule, which will use nsenter to enter the container's mount namespace, e.g.:

sed "s|\(IMPORT{program}=\).*|\1\"/usr/bin/nsenter -m -t $$ -- drbdadm sh-udev minor-%m\"|" /lib/udev/rules.d/65-drbd.rules > /udev-rules.d/65-drbd.rules

the result would be:

# This file contains the rules to create named DRBD devices.

SUBSYSTEM!="block", GOTO="drbd_end"
KERNEL!="drbd*", GOTO="drbd_end"

IMPORT{program}="/usr/bin/nsenter -m -t 15137 -- drbdadm sh-udev minor-%m"

# Use symlink from the environment if available
# some udev version thought it was a good idea to change a long established
# default of string_escape=none to string_escape=replace :-/
# therefore, recent enough drbdadm will no longer export space separated lists.
ENV{SYMLINK_BY_DISK}!="", SYMLINK+="$env{SYMLINK_BY_DISK}"
ENV{SYMLINK_BY_RES}!="", SYMLINK+="$env{SYMLINK_BY_RES}", GOTO="have_symlink"
ENV{SYMLINK}!="", OPTIONS+="string_escape=none", SYMLINK="$env{SYMLINK}", GOTO="have_symlink"

# Legacy rules for older DRBD 8.3 & 8.4 when drbdadm sh-udev did not yet export SYMLINK
ENV{DISK}!="", SYMLINK+="drbd/by-disk/$env{DISK}"
ENV{RESOURCE}!="", SYMLINK+="drbd/by-res/$env{RESOURCE}"

LABEL="have_symlink"

ENV{DEVICE}=="drbd_?*", SYMLINK+="$env{DEVICE}"

LABEL="drbd_end"

where 15137 is the PID of /usr/bin/piraeus-entry.sh as seen in the host PID namespace.
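
The substitution above can be wrapped in a small helper so that the PID and the file paths are explicit. This is only a sketch: `render_drbd_rule` is a hypothetical name that does not exist in piraeus, and the source and destination paths are whatever the entry script would pass in (the command above uses /lib/udev/rules.d/65-drbd.rules and /udev-rules.d/65-drbd.rules).

```shell
#!/bin/sh
# Sketch: rewrite the IMPORT{program} line of a DRBD udev rules file so that
# drbdadm runs inside the mount namespace of the process with the given PID.
# render_drbd_rule is a hypothetical helper name, not part of piraeus.
render_drbd_rule() {
    src=$1   # rules file shipped by drbd-utils
    dst=$2   # where the rewritten rules file is written
    pid=$3   # host-visible PID whose mount namespace holds the DRBD config
    sed "s|\(IMPORT{program}=\).*|\1\"/usr/bin/nsenter -m -t ${pid} -- drbdadm sh-udev minor-%m\"|" \
        "$src" > "$dst"
}
```

In the entry script this would be called as `render_drbd_rule /lib/udev/rules.d/65-drbd.rules /udev-rules.d/65-drbd.rules "$$"`. Note that `$$` is only a valid host PID when the pod shares the host PID namespace.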

nsenter is installed by default with util-linux on practically every Linux distro, so you don't depend on docker or crictl here.

A possible problem: if the user installs drbd-utils on the node at some point, the installation will ask them to overwrite this file.

This is just a thought; it's up for consideration.

WanzenBug commented 2 years ago

Since working on Operator 2.0 I'm also back thinking about this issue. Specifically these points:

> A possible problem: if the user installs drbd-utils on the node at some point, the installation will ask them to overwrite this file.

udev can read rules from /run/udev/rules.d, which I believe has higher priority than whatever is installed by drbd-utils, so we could make use of that.
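
To make the precedence concrete, here is a rough sketch that prints which copy of a rules file would win, following the directory order described in udev(7): /etc/udev/rules.d first, then /run/udev/rules.d, then the directories under /usr. The function name `effective_rule` and its optional root prefix are made up for illustration, and the sketch assumes drbd-utils installs its rules under /usr/lib/udev/rules.d (or /lib/udev/rules.d on a merged-/usr system).

```shell
#!/bin/sh
# Sketch: print the rules file udev would use for a given file name,
# following the directory precedence documented in udev(7):
# /etc/udev/rules.d, then /run/udev/rules.d, then /usr/local/lib/udev/rules.d,
# then /usr/lib/udev/rules.d.
# effective_rule and the optional root prefix are hypothetical, for illustration.
effective_rule() {
    name=$1
    root=${2:-}
    for dir in /etc/udev/rules.d /run/udev/rules.d \
               /usr/local/lib/udev/rules.d /usr/lib/udev/rules.d; do
        if [ -e "${root}${dir}/${name}" ]; then
            printf '%s\n' "${root}${dir}/${name}"
            return 0
        fi
    done
    return 1
}
```

After dropping a generated 65-drbd.rules into /run/udev/rules.d, one would typically run `udevadm control --reload` and then `udevadm trigger --subsystem-match=block` so that existing DRBD devices are re-evaluated.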

> We can run linstor-node with hostPID: true...

I'm wondering if there is some other way to configure that. We need the reference to the mount-namespace of the satellite process, but maybe there is some other way to get that.

Reason being: I actually want to move away from the host... options as much as possible and let the satellite "just" be a normal privileged container. That's also why DRBD 9.2 will support running from different network namespaces (spoiler: I don't think this has made it to GitHub quite yet).

bc185174 commented 2 years ago

@kvaps @WanzenBug

> Not sure if this is a bug or feature, so let's consider this first.
>
> The piraeus-operator does not share the DRBD configuration between the satellite pod and the node, so even if the node has drbd-utils installed, its udev rules will never generate symlinks in /dev/drbd/by-disk/ and /dev/drbd/by-res/ for LINSTOR devices.
>
> To fix this issue we would need to share /etc/drbd.d/linstor-resources.res and /var/lib/linstor.d/ between the host and satellite pod.
>
> See the udev rules from drbd-utils for more information: https://github.com/LINBIT/drbd-utils/blob/05c0797248af6f4e3b5b04545fe068dba41e3d81/scripts/drbd.rules.in#L6

We do have a use case for this mount between the satellite pod and the host for /etc/drbd.d/linstor-resources.res and /var/lib/linstor.d/. Some legacy systems already have DRBD installed on the host with drbdadm, and drbdadm expects the PVC resource config files to exist in /var/lib/linstor.d.

I am more than happy to contribute a PR for this, and it would be ideal if this were configurable via the Helm values file.

WanzenBug commented 2 years ago

> Some legacy systems already have drbd installed on the host with drbdadm

That may complicate things again: if you installed drbdadm on the host, you probably already have the udev rules installed, so you don't need to touch the rules. Instead, you'd only need the LINSTOR override in /etc/drbd.d and a bind mount of /var/lib/linstor.d on the host.
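
For illustration, a minimal sketch of what creating that override on the host could look like. It assumes the override is a single include line pointing at /var/lib/linstor.d/*.res; compare it with the linstor-resources.res file inside the satellite container before relying on it. `write_linstor_include` and its root prefix argument are hypothetical names.

```shell
#!/bin/sh
# Sketch: create the drbdadm include that points at LINSTOR's resource files.
# Assumption: the override consists of a single include line for
# /var/lib/linstor.d/*.res; check the file inside the satellite container
# for the exact content before relying on this.
# write_linstor_include and the root prefix are hypothetical names.
write_linstor_include() {
    root=${1:-}
    mkdir -p "${root}/etc/drbd.d"
    printf 'include "/var/lib/linstor.d/*.res";\n' \
        > "${root}/etc/drbd.d/linstor-resources.res"
}
```

Called without an argument, it writes /etc/drbd.d/linstor-resources.res on the host. The bind mount of /var/lib/linstor.d still has to be added to the satellite pod separately.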

bc185174 commented 2 years ago

> > Some legacy systems already have drbd installed on the host with drbdadm
>
> That may complicate things again: if you installed drbdadm on the host, you probably already have the udev rules installed, so you don't need to touch the rules. Instead, you'd only need the LINSTOR override in /etc/drbd.d and a bind mount of /var/lib/linstor.d on the host.

Yes - we would only need to create the bind-mounts for it to work.

kvaps commented 2 years ago

I still think that we should pass through all the DRBD and LINSTOR configuration directories. That worked fine for ages with the kube-linstor project. We were using containerized LINSTOR with the OpenNebula driver.

https://github.com/kvaps/kube-linstor/blob/0f2be0f7cb3cb1475d45eb0247a4365309955887/helm/kube-linstor/templates/satellite-daemonset.yaml#L112-L121

bc185174 commented 2 years ago

> I still think that we should pass through all the DRBD and LINSTOR configuration directories. That worked fine for ages with the kube-linstor project. We were using containerized LINSTOR with the OpenNebula driver.
>
> https://github.com/kvaps/kube-linstor/blob/0f2be0f7cb3cb1475d45eb0247a4365309955887/helm/kube-linstor/templates/satellite-daemonset.yaml#L112-L121

I'd agree, though I'd suggest making this optional. I understand not everyone will want this change!