hashicorp / vault

A tool for secrets management, encryption as a service, and privileged access management
https://www.vaultproject.io/

mlock + container + external plugin : fatal error newOSproc #19471

Open taitelman opened 1 year ago

taitelman commented 1 year ago

Describe the bug

If I set VAULT_DISABLE_MLOCK=true, everything works. However, if I set it to false (mlock enabled), my plugin crashes on startup:

starting plugin
   runtime: failed to create new OS thread (have 18 already; errno=12)
   fatal error: newosproc

   runtime stack:
   runtime.throw({0x1ceb377, 0xc000a8be30})     /usr/local/go/src/runtime/panic.go:1198 +0x71
   runtime.newosproc(0xc000100800)      /usr/local/go/src/runtime/os_linux.go:160 +0x189
   runtime.newm1(0xc000100800)      /usr/local/go/src/runtime/proc.go:2251 +0xd3
   runtime.newm(0x0, 0xc000092800, 0x0)     /usr/local/go/src/runtime/proc.go:2230 +0xe7
   runtime.startm(0x0, 0x1)     /usr/local/go/src/runtime/proc.go:2485 +0xcf
   runtime.wakep()      /usr/local/go/src/runtime/proc.go:2584 +0x5a
   runtime.resetspinning()      /usr/local/go/src/runtime/proc.go:3216 +0x45
   runtime.schedule()       /usr/local/go/src/runtime/proc.go:3374 +0x25e
   runtime.mstart1()    /usr/local/go/src/runtime/proc.go:1414 +0xcd
   runtime.mstart0()    /usr/local/go/src/runtime/proc.go:1365 +0x79
   runtime.mstart()     /usr/local/go/src/runtime/asm_amd64.s:248 +0x5

   goroutine 1 [select]:
   github.com/hashicorp/go-plugin.Serve(0xc00051fc50)       /go/pkg/mod/github.com/hashicorp/go-plugin@v1.4.3/server.go:469 +0x14b3
   github.com/hashicorp/vault/sdk/plugin.Serve(0xc00051fdb0)    /go/pkg/mod/github.com/hashicorp/vault/sdk@v0.6.0/plugin/serve.go:86 +0x44e
   main.main()      /workspace/app/myplugin/cmd/plugin/main.go:47 +0x36f

Is this related to ulimit?

bash-5.1$ ulimit -u
unlimited
bash-5.1$ echo $GOMAXPROVS

bash-5.1$ cat /proc/sys/kernel/pid_max
513521
bash-5.1$ cat /proc/sys/kernel/threads-max
513521
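
For reference, errno=12 in the runtime message is ENOMEM on Linux, and `ulimit -u` only covers the number of processes. The locked-memory limit (`ulimit -l`) is a separate setting that may matter when mlock is enabled, and the variable the Go runtime reads is GOMAXPROCS. A minimal sketch for checking both inside the container, assuming a POSIX shell; `report_mlock_limits` is a hypothetical helper name, not something Vault ships:

```shell
#!/bin/sh
# Hypothetical helper (not part of Vault): print the settings that may matter
# when mlock is enabled. Run it inside the container as the vault user.
report_mlock_limits() {
    # Max locked memory in KiB, or "unlimited". mlock/mlockall requests are
    # checked against this limit when the process has no effective CAP_IPC_LOCK.
    echo "max locked memory (ulimit -l): $(ulimit -l)"
    # The Go runtime reads GOMAXPROCS; print a marker when it is unset.
    echo "GOMAXPROCS: ${GOMAXPROCS:-unset}"
}

report_mlock_limits
```

If the first line prints a small number rather than `unlimited`, the limit is at least a candidate worth ruling out; it does not by itself prove the cause of the crash.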

Vault version: 1.9.6
Docker image OS: Red Hat Enterprise Linux release 9.1 (Plow)

The Dockerfile contains:

FROM registry.access.redhat.com/ubi9/ubi-minimal

RUN microdnf install shadow-utils policycoreutils checkpolicy libselinux-utils -y
RUN microdnf update -y

VOLUME /vault
VOLUME /tmp

# builder is the image that has the latest vault executable
COPY --from=builder /bin/vault /usr/bin/vault

RUN groupadd vault && useradd vault -g vault

COPY config.hcl /vault/config.hcl
RUN ulimit -c 0

RUN mkdir -p /vault/plugins
COPY sm-plugins /vault/plugins

COPY scripts/*.sh /vault/

RUN chown -R vault:vault /vault

ENV VAULT_DISABLE_MLOCK=false
ENV VAULT_ENABLE_FILE_PERMISSIONS_CHECK=true

RUN setcap cap_ipc_lock=+ep /usr/bin/vault
RUN setcap cap_ipc_lock=+ep /vault/plugins/myplugin

USER vault

CMD ["vault","server","-config","/vault/config.hcl"]

Deployed to Kubernetes 1.24. The vault container (via deployment.yaml) has a special securityContext (it runs as non-root):

 securityContext:
     capabilities:
         add:
         - IPC_LOCK

However, it seems that this capability prevents Vault from spawning an external plugin (the fork fails).
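
One way to see what the Vault and plugin processes actually end up with is to read the capability fields from /proc. The sketch below is a hedged diagnostic, not a fix: `show_proc_caps` is a hypothetical helper name, and it assumes a Linux /proc filesystem. If the pod also sets `allowPrivilegeEscalation: false` (not shown in this issue), the kernel's no_new_privs flag is set and file capabilities applied with setcap are not granted on exec.

```shell
#!/bin/sh
# Hypothetical helper: show the capability sets, the no_new_privs flag and the
# locked memory of a process, read from /proc/<pid>/status.
# With no argument it inspects the current shell.
show_proc_caps() {
    pid="${1:-$$}"
    status="/proc/$pid/status"
    if [ ! -r "$status" ]; then
        echo "cannot read $status"
        return 1
    fi
    # CapPrm/CapEff/CapBnd are hex bitmasks; CAP_IPC_LOCK is bit 14 (0x4000).
    # NoNewPrivs: 1 means file capabilities set with setcap are ignored on exec.
    # VmLck is the amount of memory the process currently has locked.
    grep -E '^(CapPrm|CapEff|CapBnd|NoNewPrivs|VmLck):' "$status"
}

show_proc_caps || echo "(/proc status not available on this system)"
```

Passing the PID of the vault server, or of the plugin if it stays up long enough, as the first argument shows that process instead of the shell.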

taitelman commented 1 year ago

The only workaround I have found so far is to run the container privileged, which bypasses the restriction entirely:

 securityContext:
    privileged: true     # workaround
    capabilities:
        add:
        - IPC_LOCK

This is, of course, not a secure way to run a pod.

taitelman commented 1 year ago

I suspect the other issue with Vault + mlock + external plugins is this error:

rpc error: code = Unavailable desc = error reading from server: read unix @->/tmp/plugin4213242693: use of closed network connection

It might just be a side effect of the plugin not being up, but the Vault plugin may use Unix sockets for RPC (the error shows a unix socket under /tmp), which might require more Linux capabilities.

taitelman commented 1 year ago

Probably not relevant, but for completeness, /vault/config.hcl is:

listener "tcp" {
  address     = "0.0.0.0:8200"
  tls_disable = 1
}

storage "inmem" {}

plugin_directory = "/vault/plugins"
api_addr = "http://127.0.0.1:8200"

log_level = "Debug"
log_format = "json"