src-d / coreos-nvidia

Yet another NVIDIA driver container for Container Linux (aka CoreOS)
GNU General Public License v3.0

insmod: ERROR: could not insert module /rootfs/usr/lib64/modules/4.14.19-coreos/kernel/drivers/char/ipmi/ipmi_msghandler.ko: File exists #4

Closed xoss closed 6 years ago

xoss commented 6 years ago
# cat /etc/os-release
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1632.3.0
VERSION_ID=1632.3.0
BUILD_ID=2018-02-14-0338
PRETTY_NAME="Container Linux by CoreOS 1632.3.0 (Ladybug)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"
# modprobe ipmi_devintf
# source /etc/os-release
# docker run --rm --privileged --volume /:/rootfs/ srcd/coreos-nvidia:${VERSION}
insmod: ERROR: could not insert module /rootfs/usr/lib64/modules/4.14.19-coreos/kernel/drivers/char/ipmi/ipmi_msghandler.ko: File exists

Based on the recommendation given in issue #3, I tried to start the container. Before, it failed with the message mentioned in #3, but now I hit the error shown above and have no idea how to proceed. I tried it both as a normal user and as root, without success. Any advice is welcome.

Thanks.

christianhuening commented 6 years ago

I have the same issue. Anyone with a suggestion?

trevex commented 6 years ago

Same problem here. The error indicates that the module is already loaded, so checking each dependency module and loading it only if necessary should solve this:

CMD if ! lsmod | grep "ipmi_msghandler" &> /dev/null; then insmod `find /rootfs/usr -iname ipmi_msghandler.ko`; fi && ...
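For context: `File exists` is what `insmod` prints when the kernel refuses to load a module that is already present, and the `modprobe ipmi_devintf` in the original report would already have pulled in `ipmi_msghandler` as a dependency. The check-then-load idea can be sketched as a small reusable function. The names `module_loaded` and `load_if_missing`, and the optional file argument (which allows testing without root), are hypothetical and not part of the image; the sketch also assumes `awk`, `find`, and `insmod` are available in the container.

```shell
#!/bin/sh
# Hypothetical helpers; not part of srcd/coreos-nvidia.

# module_loaded NAME [FILE]
# Succeeds if NAME is exactly the first field of a line in FILE
# (default: /proc/modules). Unlike a plain grep, this does not match
# substrings or names listed in the "used by" column.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

# load_if_missing NAME [SEARCH_ROOT]
# Inserts NAME.ko found under SEARCH_ROOT (default: /rootfs/usr)
# only when NAME is not already loaded.
load_if_missing() {
    if module_loaded "$1"; then
        echo "$1 already loaded, skipping"
        return 0
    fi
    ko=$(find "${2:-/rootfs/usr}" -iname "$1.ko" | head -n 1)
    [ -n "$ko" ] || { echo "$1.ko not found" >&2; return 1; }
    insmod "$ko"
}
```

With these defined, `load_if_missing ipmi_msghandler && load_if_missing ipmi_devintf` would replace the two inline `if` checks.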

I am gonna check this on our deployment and if it works open a PR.

trevex commented 6 years ago

I opened a PR. As a temporary workaround, I am currently using a script:

    - path: /opt/load-nvidia-driver.sh
      filesystem: root
      mode: 0755
      contents:
        inline: |
          #!/bin/bash

          if ! lsmod | grep "ipmi_msghandler" &> /dev/null; then insmod `find /rootfs/usr -iname ipmi_msghandler.ko`; fi
          if ! lsmod | grep "ipmi_devintf" &> /dev/null; then insmod `find /rootfs/usr -iname ipmi_devintf.ko`; fi
          insmod ${NVIDIA_MODULES_PATH}/nvidia.ko
          insmod ${NVIDIA_MODULES_PATH}/nvidia-uvm.ko
          nvidia-mkdevs

Then I override the container's default command with that script, e.g. in the systemd unit file:

ExecStartPre=/usr/bin/docker run --rm --privileged --volume /:/rootfs/ srcd/coreos-nvidia:${VERSION} /rootfs/opt/load-nvidia-driver.sh
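Note that `${VERSION}` in the unit line has to come from the unit's environment; one way to provide it, which is an assumption and not shown in this thread, would be `EnvironmentFile=/etc/os-release`. When running the command by hand, the same tag can be derived in the shell. The following is a minimal sketch; the function name `image_for_host` and its optional file argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the image tag matching the host's
# Container Linux version, read from an os-release style file.
# The optional argument defaults to /etc/os-release.
image_for_host() {
    (
        # Run in a subshell so os-release variables do not leak into the caller.
        unset VERSION
        . "${1:-/etc/os-release}"
        [ -n "$VERSION" ] || { echo "VERSION not set" >&2; exit 1; }
        echo "srcd/coreos-nvidia:${VERSION}"
    )
}
```

It could then be used as `docker run --rm --privileged --volume /:/rootfs/ "$(image_for_host)" /rootfs/opt/load-nvidia-driver.sh`.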