reanahub / reana-job-controller

REANA Job Controller
http://reana-job-controller.readthedocs.io/
MIT License

Condor jobs can't be submitted by a user with group ID 0 #146

Closed roksys closed 5 years ago

roksys commented 5 years ago

How to reproduce the error:

$ export CONDOR_USER=
$ useradd -M -s /bin/bash --gid 0 $CONDOR_USER
$ sudo -u $CONDOR_USER /bin/bash -c 'condor_submit job.sub'
ERROR: Submitting jobs as user/group 0 (root) is not allowed for security reasons.

This error can be bypassed by adding the -disable flag to the submission command: condor_submit -disable.

Although this allows the jobs to be submitted, they stay in the HELD state.

$ condor_q 2068392.0 -analyze

-- Schedd: bigbird14.cern.ch : <137.138.44.75:9618?...

2068392.000:  Job is held.

Hold reason: Failed to initialize user log to /var/reana/users/00000000-0000-0000-0000-000000000000/workflows/303ec603-c018-42f1-abac-b8c0faa0641b/hello.2068392.log or /dev/null

roksys commented 5 years ago

With GID=100 it does not work either:

Hold reason: Failed to initialize user log to /var/reana/users/00000000-0000-0000-0000-000000000000/workflows/c5bbfc07-2038-494c-a3ad-d0a44661b13c/hello.2069866.log or /dev/null

khurtado commented 5 years ago

When you tested with gid=100, did you also check that the group of /var/reana/users/00000000-0000-0000-0000-000000000000/workflows/303ec603-c018-42f1-abac-b8c0faa0641b is gid=100 and that its permissions are a recursive 775 rather than the standard 755, so that users in that group can write to it?

I imagine the same applies to gid=0, but I would personally avoid adding users to the root group.
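The ownership and permission check suggested above could be scripted roughly as follows. This is only a sketch: check_group_access is a hypothetical helper name, it assumes GNU coreutils stat and a find that supports -maxdepth, and it inspects only the top-level directory rather than recursing.

```shell
# Hypothetical helper (not part of HTCondor or REANA): succeeds only if
# directory $1 belongs to group ID $2 and has the group-write bit set.
# Assumes GNU coreutils `stat` and a `find` that supports -maxdepth.
check_group_access() {
    dir=$1
    want_gid=$2
    have_gid=$(stat -c '%g' "$dir") || return 2
    if [ "$have_gid" != "$want_gid" ]; then
        echo "group is $have_gid, expected $want_gid" >&2
        return 1
    fi
    # -perm -020 matches only when the group-write bit is set.
    if [ -z "$(find "$dir" -maxdepth 0 -perm -020)" ]; then
        echo "$dir is not group-writable" >&2
        return 1
    fi
    return 0
}
```

It would be called as check_group_access <workspace> 100. If it fails, chgrp -R 100 <workspace> followed by chmod -R 775 <workspace> should bring the directory in line with the settings described above.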

roksys commented 5 years ago

I ran sudo chown -R $CONDOR_USER <workspace>, but then condor can't write to the log files:

WARNING: File /home/rmaciula/hello.2070944.0.err is not writable by condor.

WARNING: File /home/rmaciula/hello.2070944.0.out is not writable by condor.

I am wondering which user is trying to write to the log files.

-rw-r--r-- 1 rmaciula root   0 May  4 19:11 hello.2070950.0.err
-rw-r--r-- 1 rmaciula root   0 May  4 19:11 hello.2070950.0.out
-rw-r--r-- 1 rmaciula root   0 May  4 19:11 hello.2070950.log

khurtado commented 5 years ago

@roksys Is /home a local disk or some network filesystem (should be defined in /etc/fstab)? Also, was condor started as root? (ps aux | grep condor_master)
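For anyone repeating these two checks, a rough sketch is below. The function names are hypothetical, the list of network filesystem types is an assumption and certainly not exhaustive, and the commands assume GNU df (for -T) and a ps that accepts POSIX -o format options.

```shell
# Hypothetical helpers for the two checks above; names are illustrative only.
# Assumes GNU `df` (for -T) and a `ps` that accepts POSIX -o options.

# Print the filesystem type backing path $1.
fs_type() {
    df -PT "$1" | awk 'NR == 2 { print $2 }'
}

# Succeed if type $1 is one of a few common network filesystems.
# The list is an assumption and is not exhaustive.
is_network_fs() {
    case $1 in
        nfs*|cifs|smb*|afs|ceph|lustre|gpfs|fuse.sshfs) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the user(s) condor_master runs as; prints nothing if it is not running.
condor_master_user() {
    ps -e -o user= -o comm= | awk '$2 == "condor_master" { print $1 }'
}
```

For example, is_network_fs "$(fs_type /home)" && echo "network filesystem" covers the first question, and an empty result from condor_master_user means no condor_master process is visible on that host.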

roksys commented 5 years ago

Job submission is done from a dockerized client, so I don't think condor is running there. The problem was that there is no shared file system between the client and the schedd. In our case we need to use condor asynchronously, with condor_submit -spool and condor_transfer_data.

Thanks for your help @khurtado!
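A minimal sketch of the asynchronous flow described above (condor_submit -spool, then condor_transfer_data) might look like this. The function names are hypothetical, the cluster ID is parsed from the "N job(s) submitted to cluster M." summary line that condor_submit prints, and any options needed to reach a remote schedd are left out because they depend on the local setup.

```shell
# Hypothetical wrappers around the two commands named in the thread.
# Options for addressing a remote schedd are omitted (setup-specific).

# Read condor_submit output on stdin and print the cluster ID, taken
# from the summary line "N job(s) submitted to cluster M.".
parse_cluster_id() {
    sed -n 's/.*submitted to cluster \([0-9][0-9]*\)\..*/\1/p'
}

# Submit $1 with input files spooled to the schedd; print the cluster ID.
submit_spooled() {
    condor_submit -spool "$1" | parse_cluster_id
}

# Pull the spooled output files of cluster $1 back to the client.
fetch_output() {
    condor_transfer_data "$1"
}
```

Typical use would be cluster=$(submit_spooled job.sub), then fetch_output "$cluster" once condor_q shows the job as completed.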