bottlerocket-os / bottlerocket

An operating system designed for hosting containers
https://bottlerocket.dev

samba mount error (but same command worked on self-managed EKS nodes of amazonlinux2) #3332

Open tooptoop4 opened 1 year ago

tooptoop4 commented 1 year ago

Image I'm using:
OS Image: Bottlerocket OS 1.14.2 (aws-k8s-1.25)
Container Runtime Version: containerd://1.6.20+bottlerocket
The affected container is based on python:3.10.11-slim with cifs-utils installed (apt install -y cifs-utils).

What I expected to happen: the mount command against the on-prem Windows share would succeed.

What actually happened: the mount failed and bash reported exit code 32:

mount.cifs kernel mount options: ip=redactedIP,unc=\\redacted\Shared,seal,vers=3.0,user=redacted,domain=NTADMIN,prefixpath=redacted\redacted,pass=********
mount error(4): Interrupted system call
Refer to the mount.cifs(8) manual page (e.g. man mount.cifs) and kernel log messages (dmesg)

dmesg shows:

[72908.566487] Key type dns_resolver registered
[72908.658049] Key type cifs.spnego registered
[72908.658056] Key type cifs.idmap registered
[72908.658944] CIFS: Attempting to mount \\redacted\Shared
[72908.864573] CIFS: VFS: cifs_mount failed w/return code = -4
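
The "mount error(4)" from mount.cifs and the kernel's "return code = -4" refer to the same Linux errno value, EINTR. As a reading aid, here is a minimal sketch of a lookup helper. The function name errno_name is hypothetical, and the table only covers a handful of Linux errno values that tend to show up in network mount failures; it is not a complete list.

```shell
# Hypothetical helper: map a few Linux errno numbers to their names.
# Accepts either the positive form (4) or the kernel's negative form (-4).
errno_name() {
    code=${1#-}   # strip a leading minus sign, if present
    case "$code" in
        2)   echo "ENOENT (No such file or directory)" ;;
        4)   echo "EINTR (Interrupted system call)" ;;
        13)  echo "EACCES (Permission denied)" ;;
        110) echo "ETIMEDOUT (Connection timed out)" ;;
        111) echo "ECONNREFUSED (Connection refused)" ;;
        112) echo "EHOSTDOWN (Host is down)" ;;
        113) echo "EHOSTUNREACH (No route to host)" ;;
        *)   echo "unknown errno $code" ;;
    esac
}

errno_name -4
```

The name alone does not explain why the call was interrupted, so it only confirms that both messages describe the same failure.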

Note that the exact same steps work on self-managed EKS 1.25 nodes based on amazonlinux2, using the same python image for the containers. I've also confirmed that /etc/resolv.conf has the same values in both environments and that the port is reachable (timeout 7 bash -c ">/dev/tcp/redact/445" &>/dev/null && echo "Online" || echo "Offline"). The files can also be accessed using this Java library (jar): https://github.com/hierynomus/smbj

How to reproduce the problem:

mkdir -p /mnt/my_mount_dir
echo -e "username=redact\npassword=redact" > p.ini
mount -vvv -t cifs -o "domain=NTADMIN,credentials=/p.ini,seal,vers=3.0" '\\redacted\Shared\redacted\redacted' /mnt/my_mount_dir
echo $?
dmesg

stmcginnis commented 1 year ago

Hi @tooptoop4 - thanks for reporting this!

It's hard to tell what could be causing this from the output so far. The Interrupted system call error is a bit vague for understanding what caused it to be interrupted.

Can you share a little more about the environment? Is the samba running on a linux host, a share from Windows, or a different storage system? If Windows, do you know if this is a DFS share?

One thing we could try is enabling debug logging for the CIFS driver. That should give a lot more detail, but it requires kernel lockdown to be disabled, so the instance needs to either be launched with settings.kernel.lockdown=none or changed at runtime with apiclient set settings.kernel.lockdown=none; apiclient reboot.
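
For the runtime route, a minimal sketch might look like the following. The two apiclient commands are the ones quoted in this comment; the wrapper function name disable_lockdown and the availability guard are illustrative assumptions, not part of Bottlerocket.

```shell
# Hypothetical wrapper around the two apiclient commands quoted above.
# Intended to be run from a Bottlerocket host container where apiclient exists.
disable_lockdown() {
    if ! command -v apiclient >/dev/null 2>&1; then
        echo "apiclient not found; is this a Bottlerocket host container?" >&2
        return 1
    fi
    # Change the setting, then reboot so the kernel picks it up.
    apiclient set settings.kernel.lockdown=none && apiclient reboot
}

# Usage (this reboots the node, so drain workloads first):
#   disable_lockdown
```

Because the node reboots, any running pods on it are interrupted, so this is best tried on a node dedicated to debugging.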

Once lockdown is set to none, connect to the admin container and run sheltie. That will get you an interactive root prompt on the host. From there you can try:

modprobe cifs  # ensure CIFS driver is loaded so we can enable debug flags
dmesg -n 8
echo 'module cifs +p' >/sys/kernel/debug/dynamic_debug/control
echo -n 2 >/proc/fs/cifs/cifsFYI

That will turn on debug mode. You can then try the mount again and check the output to see if it gives any more details about what is happening.
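
With debug logging on, the kernel log can get noisy. As a rough sketch, a small filter can narrow it to CIFS-related lines after retrying the mount. The function name cifs_log_lines is hypothetical, and the case-insensitive match on "cifs" is an assumption about how the driver labels its messages, based on the dmesg output posted earlier in this issue.

```shell
# Hypothetical helper: keep only kernel log lines that mention CIFS.
# Reads from stdin so it works on live dmesg output or on a saved log file.
cifs_log_lines() {
    grep -i 'cifs'
}

# Usage on the host, after retrying the mount:
#   dmesg | cifs_log_lines | tail -n 50
```

Attaching the filtered output to the issue, with hostnames and credentials redacted, should make the failure easier to pin down.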

tooptoop4 commented 1 year ago

It's a DFS share from Windows. I will need more time to work through the debugging instructions above.

tooptoop4 commented 9 months ago

From what I can tell, changes to /etc/resolv.conf on the pod are not taking effect.