nextflow-io / nextflow

A DSL for data-driven computational pipelines
http://nextflow.io
Apache License 2.0

Running Nextflow using Kubernetes and NFS persistent volume with non-root user will result in a permission denied exception #1270

Closed · glmanhtu closed this issue 4 years ago

glmanhtu commented 5 years ago

Bug report

Hello guys. I was trying to run Nextflow on Kubernetes and ran into a permission problem with a non-root user. Nextflow tries to initialise the task pod, but it fails with the exception "container init caused \"mkdir /mnt/tvu: permission denied\"". Below is the Kubernetes manifest file generated by Nextflow:

apiVersion: v1
kind: Pod
metadata:
  name: nf-68b883e7affa9dc5d6d5e721c75b21c4
  namespace: default
  labels: {app: nextflow, runName: sleepy-goodall, taskName: indexPeptides, processName: indexPeptides,
    sessionId: uuid-0e5b6c81-4527-4213-a9ba-16e40e424220}
spec:
  restartPolicy: Never
  containers:
  - name: nf-68b883e7affa9dc5d6d5e721c75b21c4
    image: omicsdi/crux:latest
    command: [/bin/bash, -ue, .command.run]
    workingDir: /mnt/tvu/work/68/b883e7affa9dc5d6d5e721c75b21c4
    volumeMounts:
    - {name: vol-1, mountPath: /mnt}
  securityContext: {runAsUser: 2801}
  volumes:
  - name: vol-1
    persistentVolumeClaim: {claimName: pride-pv-claim}

I have assigned rw permission for user 2801 to the NFS persistent volume. When I removed the workingDir, the pod started successfully and I was able to cd into the /mnt/tvu/work/68/b883e7affa9dc5d6d5e721c75b21c4 folder. So I suspect that runAsUser and workingDir don't get along.

In my opinion this is really a Kubernetes issue, because if the given user has rw permission on the workingDir then the pod should be able to start successfully. However, I think Nextflow could handle this better by setting the working directory inside the .command.run file instead of using the workingDir parameter.

Expected behavior and actual behavior

Expect the pod to be able to start successfully.

Steps to reproduce the problem

  1. Create an NFS storage with read/write permission assigned to user id 2801 (a minimal sketch of this step is shown after this list)
  2. Create a persistent volume for it:
    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: pride-nfs
    spec:
      capacity:
        storage: 5Ti
      accessModes:
        - ReadWriteMany
      nfs:
        server: <server ip>
        path: <nfs path>
  3. Create a persistent volume claim:
    kind: PersistentVolumeClaim
    apiVersion: v1
    metadata:
      name: pride-pv-claim
    spec:
      storageClassName: ""
      accessModes:
        - ReadWriteMany
      resources:
        requests:
          storage: 5Ti
  4. Run Nextflow:
    nextflow kuberun https://github.com/glmanhtu/nf-workflows -v pride-pv-claim:/mnt -profile kubernetes

    From here, Nextflow reports only that the process terminated for an unknown reason. Use the kubectl get pods command to find the pod that was created (its name should be something like nf-5f444e721a0ca7373744d096756fd62a), then use kubectl describe pod to see the actual error message.
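For step 1, a minimal sketch of what granting those permissions could look like, assuming a plain Linux NFS server (the export path /export/pride and the exact commands are illustrative placeholders, not taken from this setup):

    # on the NFS server: make the exported tree owned and writable by uid 2801
    sudo chown -R 2801:2801 /export/pride
    sudo chmod -R u+rwX /export/pride
    # then export it read/write, e.g. in /etc/exports:
    #   /export/pride  *(rw,sync,no_subtree_check)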

Program output

azorin-ml:Desktop tvu$ nextflow kuberun https://github.com/glmanhtu/nf-workflows -v pride-pv-claim:/mnt -profile kubernetes
Launcher pod spec file: .nextflow.pod.yaml
Pod started: scruffy-poisson
N E X T F L O W  ~  version 19.04.1
Launching `glmanhtu/nf-workflows` [scruffy-poisson] - revision: 2270dc57fe [master]
NOTE: Your local project version looks outdated - a different revision is available in the remote repository [71fea5c3b6]
[warm up] executor > k8s
executor >  k8s (1)
[09/68a59b] process > indexPeptides [  0%] 0 of 1

executor >  k8s (1)
[09/68a59b] process > indexPeptides [  0%] 0 of 1
ERROR ~ Error executing process > 'indexPeptides'

Caused by:
  Process `indexPeptides` terminated for an unknown reason -- Likely it has been terminated by the external system

Command executed:

  crux tide-index small-yeast.fasta yeast-index
  crux tide-search --compute-sp T --mzid-output T demo.ms2 yeast-index

Command exit status:
  -

Command output:
  (empty)

Work dir:
  /mnt/tvu/work/09/68a59b00385ccdf4473941fc217367

Tip: when you have fixed the problem you can continue the execution appending to the nextflow command line the option `-resume`

 -- Check '.nextflow.log' file for details

executor >  k8s (1)
[09/68a59b] process > indexPeptides [100%] 1 of 1, failed: 1 ✘
ERROR ~ Error executing process > 'indexPeptides'

Caused by:
  Process `indexPeptides` terminated for an unknown reason -- Likely it has been terminated by the external system

Command executed:

  crux tide-index small-yeast.fasta yeast-index
  crux tide-search --compute-sp T --mzid-output T demo.ms2 yeast-index

Command exit status:
  -

Command output:
  (empty)

Work dir:
  /mnt/tvu/work/09/68a59b00385ccdf4473941fc217367

Tip: when you have fixed the problem you can continue the execution appending to the nextflow command line the option `-resume`

 -- Check '.nextflow.log' file for details
.nextflow.log:

Aug-08 09:24:43.997 [main] DEBUG nextflow.cli.Launcher - $> nextflow kuberun 'https://github.com/glmanhtu/nf-workflows' -v 'pride-pv-claim:/mnt' -profile kubernetes
Aug-08 09:24:44.215 [main] DEBUG nextflow.scm.AssetManager - Repository URL: https://github.com/glmanhtu/nf-workflows; Project: glmanhtu/nf-workflows; Hub provider: github
Aug-08 09:24:44.224 [main] DEBUG nextflow.scm.RepositoryProvider - Request [credentials -:-] -> https://api.github.com/repos/glmanhtu/nf-workflows/contents/nextflow.config
Aug-08 09:24:44.805 [main] DEBUG nextflow.config.ConfigBuilder - Found config local: /Users/tvu/Documents/Projects/workflow/ms-crux-id-nf/nextflow.config
Aug-08 09:24:44.806 [main] DEBUG nextflow.config.ConfigBuilder - User config file: nextflow.scm.ProviderPath@14fc1f0
Aug-08 09:24:44.806 [main] DEBUG nextflow.config.ConfigBuilder - Parsing config file: /Users/tvu/Documents/Projects/workflow/ms-crux-id-nf/nextflow.config
Aug-08 09:24:44.807 [main] DEBUG nextflow.config.ConfigBuilder - Parsing config file: /Users/tvu/Documents/Projects/workflow/ms-crux-id-nf/nextflow.config
Aug-08 09:24:44.827 [main] DEBUG nextflow.config.ConfigBuilder - Applying config profile: `kubernetes`
Aug-08 09:24:45.308 [main] DEBUG nextflow.config.ConfigBuilder - Applying config profile: `kubernetes`
Aug-08 09:24:45.337 [main] DEBUG nextflow.config.ConfigBuilder - Available config profiles: [kubernetes, docker]
Aug-08 09:24:45.565 [main] DEBUG nextflow.k8s.K8sConfig - Kubernetes workDir=/mnt/tvu/work; projectDir=/mnt/projects; volumeClaims=[pride-pv-claim]
Aug-08 09:24:45.777 [main] INFO  nextflow.k8s.K8sDriverLauncher - Launcher pod spec file: .nextflow.pod.yaml

Environment

pditommaso commented 5 years ago

> However, I think Nextflow could handle this better by setting the working directory inside the .command.run file instead of using the workingDir parameter.

Try specifying process.scratch = true in the config file.
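That is, a single line in the configuration:

    // run each task in a node-local scratch dir and copy the results back,
    // instead of executing directly inside the shared work dir
    process.scratch = true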

glmanhtu commented 5 years ago

> > However, I think Nextflow could handle this better by setting the working directory inside the .command.run file instead of using the workingDir parameter.
>
> Try specifying process.scratch = true in the config file.

The configuration was already there; you can see it here: https://github.com/glmanhtu/nf-workflows/blob/master/nextflow.config

pditommaso commented 5 years ago

How are you launching the pipeline?

glmanhtu commented 5 years ago

> How are you launching the pipeline?

Here is the command:

nextflow kuberun https://github.com/glmanhtu/nf-workflows -v pride-pv-claim:/mnt -profile kubernetes

pditommaso commented 5 years ago

I fear that the config is not properly propagated (similar to #1050).

What if you create a nextflow.config in the launching dir, add process.scratch = true, and then try to execute it again?

glmanhtu commented 5 years ago

> I fear that the config is not properly propagated (similar to #1050).
>
> What if you create a nextflow.config in the launching dir, add process.scratch = true, and then try to execute it again?

I tried, and the result is the same. I added process.scratch = true to nextflow.config:

nextflowVersion = '1.2+' 
process.scratch = true

profiles {
  docker {
    docker {
      enabled = true
    }
  }
  kubernetes {
    process.executor = 'k8s'
    process.scratch = true
    k8s {
      debug.yaml = true      
      pod = [runAsUser: 2801]
    }
  }
}

Command to run: nextflow kuberun https://github.com/glmanhtu/nf-workflows -v pride-pv-claim:/mnt -profile kubernetes

output:

azorin-ml:ms-crux-id-nf tvu$ nextflow kuberun https://github.com/glmanhtu/nf-workflows -v pride-pv-claim:/mnt -profile kubernetes
Launcher pod spec file: .nextflow.pod.yaml
Pod started: silly-plateau
N E X T F L O W  ~  version 19.04.1
Pulling glmanhtu/nf-workflows ...
downloaded from https://github.com/glmanhtu/nf-workflows.git
Launching `glmanhtu/nf-workflows` [silly-plateau] - revision: 51f5cd4284 [master]
[warm up] executor > k8s
executor >  k8s (1)
[30/317ad3] process > indexPeptides [  0%] 0 of 1

executor >  k8s (1)
[30/317ad3] process > indexPeptides [  0%] 0 of 1
ERROR ~ Error executing process > 'indexPeptides'

Caused by:
  Process `indexPeptides` terminated for an unknown reason -- Likely it has been terminated by the external system

Command executed:

  crux tide-index small-yeast.fasta yeast-index
  crux tide-search --compute-sp T --mzid-output T demo.ms2 yeast-index

Command exit status:
  -

Command output:
  (empty)

Work dir:
  /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

 -- Check '.nextflow.log' file for details

executor >  k8s (1)
[30/317ad3] process > indexPeptides [100%] 1 of 1, failed: 1 ✘
ERROR ~ Error executing process > 'indexPeptides'

Caused by:
  Process `indexPeptides` terminated for an unknown reason -- Likely it has been terminated by the external system

Command executed:

  crux tide-index small-yeast.fasta yeast-index
  crux tide-search --compute-sp T --mzid-output T demo.ms2 yeast-index

Command exit status:
  -

Command output:
  (empty)

Work dir:
  /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

 -- Check '.nextflow.log' file for details

Pod created: nf-30317ad36174f9e2a08dbb6046d8c3e6

Pod error message:

    OCI runtime create failed: container_linux.go:345: starting container process caused "process_linux.go:424: container init caused \"mkdir /mnt/tvu: permission denied\"": unknown

pditommaso commented 5 years ago

Can you copy here the .command.run script for that task?

glmanhtu commented 5 years ago

> Can you copy here the .command.run script for that task?

.command.run:

#!/bin/bash
# NEXTFLOW TASK: indexPeptides
set -e
set -u
NXF_DEBUG=${NXF_DEBUG:=0}; [[ $NXF_DEBUG > 1 ]] && set -x
NXF_ENTRY=${1:-nxf_main}

nxf_date() {
    local ts=$(date +%s%3N); [[ $ts == *3N ]] && date +%s000 || echo $ts
}

nxf_env() {
    echo '============= task environment ============='
    env | sort | sed "s/\(.*\)AWS\(.*\)=\(.\{6\}\).*/\1AWS\2=\3xxxxxxxxxxxxx/"
    echo '============= task output =================='
}

nxf_kill() {
    declare -a children
    while read P PP;do
        children[$PP]+=" $P"
    done < <(ps -e -o pid= -o ppid=)

    kill_all() {
        [[ $1 != $$ ]] && kill $1 2>/dev/null || true
        for i in ${children[$1]:=}; do kill_all $i; done
    }

    kill_all $1
}

nxf_mktemp() {
    local base=${1:-/tmp}
    if [[ $(uname) = Darwin ]]; then mktemp -d $base/nxf.XXXXXXXXXX
    else TMPDIR="$base" mktemp -d -t nxf.XXXXXXXXXX
    fi
}

on_exit() {
    exit_status=${nxf_main_ret:=$?}
    printf $exit_status > /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/.exitcode
    set +u
    [[ "$tee1" ]] && kill $tee1 2>/dev/null
    [[ "$tee2" ]] && kill $tee2 2>/dev/null
    [[ "$ctmp" ]] && rm -rf $ctmp || true
    rm -rf $NXF_SCRATCH || true
    exit $exit_status
}

on_term() {
    set +e
    [[ "$pid" ]] && nxf_kill $pid
}

nxf_launch() {
    /bin/bash -ue /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/.command.sh
}

nxf_stage() {
    true
    # stage input files
    rm -f small-yeast.fasta
    rm -f demo.ms2
    ln -s /mnt/projects/glmanhtu/nf-workflows/data/small-yeast.fasta small-yeast.fasta
    ln -s /mnt/projects/glmanhtu/nf-workflows/data/demo.ms2 demo.ms2
}

nxf_unstage() {
    true
    cp .command.out /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/.command.out || true
    cp .command.err /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/.command.err || true
    [[ ${nxf_main_ret:=0} != 0 ]] && return
    mkdir -p /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6
    mkdir -p /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/crux-output && cp -fRL crux-output/tide-search.target.txt /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/crux
    mkdir -p /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/crux-output && cp -fRL crux-output/tide-search.decoy.txt /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/crux-
}

nxf_main() {
    trap on_exit EXIT
    trap on_term TERM INT USR1 USR2

    NXF_SCRATCH="$(set +u; nxf_mktemp $TMPDIR)"
    [[ $NXF_DEBUG > 0 ]] && nxf_env
    touch /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/.command.begin
    set +u
    set -u
    [[ $NXF_SCRATCH ]] && echo "nxf-scratch-dir $HOSTNAME:$NXF_SCRATCH" && cd $NXF_SCRATCH
    nxf_stage

    set +e
    local ctmp=$(set +u; nxf_mktemp /dev/shm 2>/dev/null || nxf_mktemp $TMPDIR)
    local cout=$ctmp/.command.out; mkfifo $cout
    local cerr=$ctmp/.command.err; mkfifo $cerr
    tee .command.out < $cout &
    tee1=$!
    tee .command.err < $cerr >&2 &
    tee2=$!
    ( nxf_launch ) >$cout 2>$cerr &
    pid=$!
    wait $pid || nxf_main_ret=$?
    wait $tee1 $tee2
    nxf_unstage
}

$NXF_ENTRY

.command.yaml:

apiVersion: v1
kind: Pod
metadata:
  name: nf-30317ad36174f9e2a08dbb6046d8c3e6
  namespace: default
  labels: {app: nextflow, runName: silly-plateau, taskName: indexPeptides, processName: indexPeptides,
    sessionId: uuid-600b27a0-298d-4b01-88e8-ea3f68ecf077}
spec:
  restartPolicy: Never
  containers:
  - name: nf-30317ad36174f9e2a08dbb6046d8c3e6
    image: omicsdi/crux:latest
    command: [/bin/bash, -ue, .command.run]
    workingDir: /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6
    volumeMounts:
    - {name: vol-1, mountPath: /mnt}
  securityContext: {runAsUser: 2801}
  volumes:
  - name: vol-1
    persistentVolumeClaim: {claimName: pride-pv-claim}

pditommaso commented 5 years ago

I guess the problem is when it copies the results from the scratch dir back to the shared dir:

    mkdir -p /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6
    mkdir -p /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/crux-output && cp -fRL crux-output/tide-search.target.txt /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/crux
    mkdir -p /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/crux-output && cp -fRL crux-output/tide-search.decoy.txt /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/crux-

Not sure how much I can help here. You need to have that dir writable.

glmanhtu commented 5 years ago

> I guess the problem is when it copies the results from the scratch dir back to the shared dir:
>
>     mkdir -p /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6
>     mkdir -p /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/crux-output && cp -fRL crux-output/tide-search.target.txt /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/crux
>     mkdir -p /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/crux-output && cp -fRL crux-output/tide-search.decoy.txt /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/crux-
>
> Not sure how much I can help here. You need to have that dir writable.

Actually, the dir is writable for user 2801, as I specified. However, I suspect the securityContext is not applied before the workingDir directive, and that is what causes the permission denied error. So if there were a way to remove the workingDir directive, it should work fine.
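If the ownership of the mounted volume turns out to be the underlying problem, another angle might be a pod-level fsGroup, so that the volume contents are group-writable for the task user. This is only a sketch and untested here; fsGroup is a standard Kubernetes securityContext field, but whether it is actually applied to an NFS volume depends on the cluster setup:

    securityContext:
      runAsUser: 2801
      fsGroup: 2801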

pditommaso commented 5 years ago

You can try to hack that yaml and the .command.run script to make it work.

glmanhtu commented 5 years ago

> You can try to hack that yaml and the .command.run script to make it work.

I deleted the workingDir directive, changed the command from command: [/bin/bash, -ue, .command.run] to command: [/bin/bash, -ue, /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/.command.run], and then started the pod manually with kubectl apply -f .command.yaml. It ran successfully.
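The edited spec looked roughly like this (only the changed lines differ from the generated .command.yaml above; everything else was left as is):

    spec:
      containers:
      - name: nf-30317ad36174f9e2a08dbb6046d8c3e6
        image: omicsdi/crux:latest
        # workingDir removed; .command.run is now invoked via its absolute path
        command: [/bin/bash, -ue, /mnt/tvu/work/30/317ad36174f9e2a08dbb6046d8c3e6/.command.run]
        volumeMounts:
        - {name: vol-1, mountPath: /mnt}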

azorin-ml:testing tvu$ kubectl log nf-30317ad36174f9e2a08dbb6046d8c3e6
log is DEPRECATED and will be removed in a future version. Use logs instead.
nxf-scratch-dir nf-30317ad36174f9e2a08dbb6046d8c3e6:/tmp/nxf.dXKyOaBpKT
INFO: Beginning tide-index.
INFO: Writing results to output directory 'crux-output'.
INFO: CPU: nf-30317ad36174f9e2a08dbb6046d8c3e6
INFO: Fri Aug  9 13:18:38 UTC 2019
INFO: Running tide-index...
INFO: Writing results to output directory 'yeast-index'.
INFO: Reading small-yeast.fasta and computing unmodified peptides...
INFO: Writing decoy fasta...
INFO: Reading proteins
INFO: Precomputing theoretical spectra...
INFO: Elapsed time: 0.0265 s
INFO: Finished crux tide-index.
INFO: Return Code:0
INFO: Beginning tide-search.
WARNING: The output directory 'crux-output' already exists.
Existing files will not be overwritten.
INFO: CPU: nf-30317ad36174f9e2a08dbb6046d8c3e6
INFO: Fri Aug  9 13:18:38 UTC 2019
INFO: Running tide-search...
INFO: Reading index yeast-index
INFO: Reading spectra file demo.ms2
INFO: Converting demo.ms2 to spectrumrecords format
INFO: Sorting spectra
INFO: Running search
INFO: Elapsed time: 0.375 s
INFO: Finished crux tide-search.
INFO: Return Code:0

But the thing is, these files are generated by Nextflow, so we can't do this by ourselves every time. It would be great if Nextflow added some option to do this automatically.

pditommaso commented 5 years ago

That could be a possible patch, but I can't incorporate it directly into the main code base without more rigorous testing and an assessment of the pros and cons. I would suggest you create your own build and test it, then send a PR.

glmanhtu commented 5 years ago

> That could be a possible patch, but I can't incorporate it directly into the main code base without more rigorous testing and an assessment of the pros and cons. I would suggest you create your own build and test it, then send a PR.

We found another workaround for this problem, by setting up the permissions on the NFS storage itself.

pditommaso commented 5 years ago

Leaving it open because it may be useful to handle this on the NF side.

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

matthewstuartedwards commented 1 year ago

> We found another workaround for this problem, by setting up the permissions on the NFS storage itself.

I'm encountering this issue without Kubernetes, but with an NFS mount. I'm curious what NFS permissions glmanhtu changed here.