Closed kcleung closed 7 years ago
Nothing should be stored in /tmp. Temporary run directories should be created in /home/jobe/runs and should be deleted when the job finishes, unless debugging is enabled globally or set for the particular run. What data are you finding in /tmp? Can you post a long directory listing of /tmp and perhaps of one or two subdirectories (if it has any), please?
Here are the contents of /tmp:
compsci373prd03:/tmp$ ls
agent_management.log  ccoKZhAa.c         hsperfdata_jobe07
cc1aNRsa.o            ccopgzKB.ld        hsperfdata_jobe08
cc4rO34Q.o            ccs55pNk.s         hsperfdata_jobe09
cc5t8kn2.s            ccUlGJt6.o         hsperfdata_kleu044
ccA3eGUw.s            ccvyi4WV.o         hsperfdata_www-data
cccHjsla.ld           ccwS0zGA.c         jobe_gupliv
cceRo3tC.le           cczycPuw.o         jobe_HIL5au
ccFrC1s8.s            hsperfdata_jobe00  jobe_language_cache_file
ccGcNzX7.o            hsperfdata_jobe01  jobe_Nq0ZyN
ccgiS4pm.le           hsperfdata_jobe02  jobe_Q37ktS
cchiMydh.ld           hsperfdata_jobe03  krb5cc_0
ccIaYyp6.c            hsperfdata_jobe04  last_run_summary.yaml
ccJ2Z3da.le           hsperfdata_jobe05  vmware-root
cckkL7s6.o            hsperfdata_jobe06
compsci373prd03:/tmp$
and here is one of the tmp directories:
compsci373prd03:/tmp$ ls jobe_Q37ktS
compile.out  prog.cmd  prog.err  prog.in  prog.out  prog.python3  pycache
compsci373prd03:/tmp$
I had a good look at your code. I can also verify that Task->close() in LanguageTask.php, and the line "exec("sudo rm -R {$dir}");" inside it, are in fact called, but for some strange reason they have no effect on the system and the directory is not deleted.
Here is the long directory listing of /tmp:
compsci373prd03:/tmp$ ls -lh
total 3.3M
-rw-r--r-- 1 root root 412 May 2 09:26 agent_management.log
-rw------- 1 jobe00 jobe 0 Apr 27 17:02 cc1aNRsa.o
-rw------- 1 jobe00 jobe 0 Apr 27 16:19 cc4rO34Q.o
-rw------- 1 jobe06 jobe 1.3M Apr 29 16:50 cc5t8kn2.s
-rw------- 1 jobe05 jobe 0 Apr 29 16:51 ccA3eGUw.s
-rw------- 1 jobe00 jobe 0 Apr 27 17:02 cccHjsla.ld
-rw------- 1 jobe00 jobe 0 Mar 7 13:49 cceRo3tC.le
-rw------- 1 jobe03 jobe 163K Apr 29 16:50 ccFrC1s8.s
-rw------- 1 jobe03 jobe 66K Apr 29 16:50 ccGcNzX7.o
-rw------- 1 jobe00 jobe 0 Apr 27 16:19 ccgiS4pm.le
-rw------- 1 jobe00 jobe 0 Mar 7 13:49 cchiMydh.ld
-rw------- 1 jobe00 jobe 0 Apr 27 16:19 ccIaYyp6.c
-rw------- 1 jobe00 jobe 0 Apr 27 17:02 ccJ2Z3da.le
-rw------- 1 jobe03 jobe 174K Apr 29 16:50 cckkL7s6.o
-rw------- 1 jobe00 jobe 0 Apr 27 17:02 ccoKZhAa.c
-rw------- 1 jobe00 jobe 0 Apr 27 16:19 ccopgzKB.ld
-rw------- 1 jobe08 jobe 284K Apr 29 16:52 ccs55pNk.s
-rw------- 1 jobe03 jobe 684K Apr 29 16:50 ccUlGJt6.o
-rw------- 1 jobe00 jobe 0 Mar 7 13:49 ccvyi4WV.o
-rw------- 1 jobe00 jobe 0 Mar 7 13:49 ccwS0zGA.c
-rw------- 1 jobe05 jobe 684K Apr 29 16:50 cczycPuw.o
drwxr-xr-x 2 jobe00 jobe 6 Apr 4 14:03 hsperfdata_jobe00
drwxr-xr-x 2 jobe01 jobe 6 Apr 3 10:27 hsperfdata_jobe01
drwxr-xr-x 2 jobe02 jobe 6 Apr 3 10:27 hsperfdata_jobe02
drwxr-xr-x 2 jobe03 jobe 6 Apr 3 10:27 hsperfdata_jobe03
drwxr-xr-x 2 jobe04 jobe 6 Apr 3 10:27 hsperfdata_jobe04
drwxr-xr-x 2 jobe05 jobe 6 Apr 3 10:27 hsperfdata_jobe05
drwxr-xr-x 2 jobe06 jobe 6 Apr 3 10:27 hsperfdata_jobe06
drwxr-xr-x 2 jobe07 jobe 6 Apr 3 10:27 hsperfdata_jobe07
drwxr-xr-x 2 jobe08 jobe 6 Apr 3 10:27 hsperfdata_jobe08
drwxr-xr-x 2 jobe09 jobe 6 Apr 3 10:27 hsperfdata_jobe09
drwxr-x--- 2 kleu044 kleu044 6 Apr 4 14:03 hsperfdata_kleu044
drwxr-xr-x 2 www-data www-data 6 May 2 10:14 hsperfdata_www-data
drwxrwxr-x+ 3 www-data www-data 127 May 2 10:15 jobe_gupliv
drwxrwxr-x+ 3 www-data www-data 127 May 2 10:14 jobe_HIL5au
-rw-r--r-- 1 www-data www-data 149 May 2 10:14 jobe_language_cache_file
drwxrwxr-x+ 3 www-data www-data 127 May 2 10:15 jobe_Nq0ZyN
drwxrwxr-x+ 3 www-data www-data 127 May 2 10:15 jobe_Q37ktS
-rw------- 1 kleu044 kleu044 4.1K May 2 10:12 krb5cc_0
-rw-r--r-- 1 root root 993 May 2 10:18 last_run_summary.yaml
drwx------ 2 root root 6 May 2 09:26 vmware-root
compsci373prd03:/tmp$
Right, so those are clearly the task working directories, which should be created in /home/jobe/runs. Someone else has reported a similar problem, so I'd like to know why it's happening. I'm also puzzled that the directory isn't being deleted anyway, but let's worry about why the directory is in the wrong place, first.
The working directory is created in LanguageTask.php by the Task constructor, the code for which begins
$this->workdir = tempnam("/home/jobe/runs", "jobe_");
It looks like something has gone wrong with the path or privileges of the directory /home/jobe/runs on your system. Could you give me a long directory listing of /home/jobe, please?
Here is a long directory listing of /home/jobe. Basically it just contains an empty directory "runs".
root@compsci373prd03:/home# ls -lh jobe
total 0
drwxrwx--x 2 jobe www-data 6 Jan 11 14:26 runs
root@compsci373prd03:/home# tree jobe
jobe
└── runs

1 directory, 0 files
root@compsci373prd03:/home# ls -lhd jobe
drwx------ 3 jobe jobe 69 Jan 11 20:56 jobe
root@compsci373prd03:/home#
I just had a look at this article:
http://php.net/manual/en/function.tempnam.php
It looks to me as if, for some reason, the first parameter $dir of tempnam() is not taking effect, so tempnam() reverts to the default "/tmp".
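The PHP manual states that when the requested directory does not exist or is not writable, tempnam() may create the file in the system's temporary directory instead. As a rough shell illustration of the stricter alternative, the sketch below refuses to fall back and fails loudly. It is not Jobe's code; the function name make_workdir is hypothetical, and the "jobe_" prefix is borrowed from the tempnam() call quoted above.

```shell
# Hypothetical helper: create a per-job directory under $1, or fail.
# Unlike PHP's tempnam(), it never falls back to /tmp.
make_workdir() {
    base=$1
    # The caller needs write and search (x) permission on the base
    # directory, and search permission on every ancestor of it.
    if [ ! -d "$base" ] || [ ! -w "$base" ] || [ ! -x "$base" ]; then
        echo "cannot create a work directory in $base" >&2
        return 1
    fi
    mktemp -d "$base/jobe_XXXXXX"
}
```

Called as `make_workdir /home/jobe/runs` by the web server user, this would print an error rather than quietly leaving run directories in /tmp.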
As you can see from above, /home/jobe is only accessible to the user "jobe", but not jobe00, jobe01, jobe02 etc., so does the install script also need to change permission of /home/jobe?
Ah, I see the problem. The directory /home/jobe isn't searchable by the webserver, so the working directory can't be created in /home/jobe/runs. Can you try changing the group of /home/jobe to www-data and the mode to 750, please. You should then find runs being created in /home/jobe/runs and being deleted when the run completes. If that solves the problem I'll include those changes in the installer. What OS are you using, by the way?
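To see which ancestor is blocking the web server, it can help to list the mode and ownership of every directory on the way to the run directory. A minimal sketch, assuming GNU stat (as on Ubuntu); the function name show_path_perms is made up for illustration, and `namei -l /home/jobe/runs` from util-linux gives a similar view.

```shell
# Print mode, owner:group and name for a path and each of its ancestors.
show_path_perms() {
    path=$1
    while :; do
        stat -c '%A %U:%G %n' "$path"
        # Stop at the filesystem root (or "." for a relative path).
        if [ "$path" = "/" ] || [ "$path" = "." ]; then
            break
        fi
        path=$(dirname "$path")
    done
}

# Example use (path from the thread):
# show_path_perms /home/jobe/runs
```

Every directory in the output needs the x (search) bit for the class the web server user falls into (owner, group, or other) before anything beneath it can be reached.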
I made the changes by:
sudo chgrp -R www-data /home/jobe
sudo chmod -R 750 /home/jobe
However, it still doesn't work, because jobe00, jobe01, etc. are in the group jobe, not www-data:
compsci373prd03:/tmp$ groups jobe00
jobe00 : jobe
compsci373prd03:/tmp$
However, when I switched /home/jobe back to group jobe, it still didn't work.
I am using Ubuntu 16.04 LTS, but our ITS has imposed much stricter access control defaults than standard Ubuntu.
In my account the umask is 0007, and after I run "sudo su" the umask becomes 0077.
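For reference, the umask determines the default mode of newly created directories, which may be one reason a hardened system ends up with tighter permissions than a stock Ubuntu install. Whether that is what happened here is not established in this thread. A small sketch using the two umask values mentioned above, assuming GNU stat; the directory names are arbitrary.

```shell
demo=$(mktemp -d)
# Each subshell keeps its umask change local.
(umask 0007 && mkdir "$demo/user_dir")   # 0777 & ~0007 = 0770
(umask 0077 && mkdir "$demo/root_dir")   # 0777 & ~0077 = 0700
# Show the resulting octal modes.
stat -c '%a %n' "$demo/user_dir" "$demo/root_dir"
```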
OK, then how about just setting the mode of /home/jobe to 755 as in the default Ubuntu install?
I think I can't, because configurations are largely controlled by the puppet script system from the university, so things we write to /etc are usually overwritten by puppet.
I just did:
$ cd /home
$ sudo chmod -R 755 jobe
The jobe workers still can't write to /home/jobe/runs.
Then I changed the group ownership from jobe to www-data:
$ sudo chgrp -R www-data jobe
It still doesn't work.
It looks like www-data will need write access to /home/jobe/runs.
At the moment, jobe and its workers jobe00... are in group jobe, and /home/jobe is also owned by the jobe group, not the www-data group, so it's no wonder that www-data can't write to /home/jobe/runs.
So the solution is to make /home/jobe/runs group-writable and group-owned by www-data.
However, does this raise any security concerns?
I can also verify that once I made the changes outlined above, the jobe workers create temp directories in /home/jobe/runs and delete them once the job has finished.
I'm confused. www-data should have had write access to that directory right from the start - that's how it's set up by the installer and your directory listing of it seemed to confirm that was OK:
drwxrwx--x 2 jobe www-data 6 Jan 11 14:26 runs
So exactly what changes have you made to get it working?
No, /home/jobe/runs started out as drwx------ and owned by jobe:jobe,
so I needed to change it to:
drwxrwx--- and owned by jobe:www-data
in order for www-data to write to /home/jobe/runs.
On a cloned jobe VM, I moved the original /home/jobe to a different place, and then reran the install script:
$ cd /var/www/html/jobe
$ sudo ./install
It created a new /home/jobe with drwxr-x--- owned by root:root
and /home/jobe/runs with drwxrwx--x owned by jobe:www-data
When I submitted jobs to the cloned VM, www-data again couldn't write to /home/jobe/runs.
So in this case, www-data can't access /home/jobe/runs because /home/jobe is owned by root:root with mode drwxr-x---.
How come /home/jobe is owned by root:root? Is it because I ran the install script under sudo when the users jobe, jobe00, etc. already existed? In that case, we would need to explicitly set the ownership of /home/jobe to jobe:www-data, right?
After running $ sudo /var/www/html/jobe/install.sh
this is what I have:
jobetest01:/var/www/html/jobe$ sudo ls -lhd /home/jobe
drwxr-x--- 3 root root 18 May 3 21:45 /home/jobe
jobetest01:/var/www/html/jobe$ sudo ls -lh /home/jobe
total 0
drwxrwx--x 2 jobe www-data 6 May 3 21:45 runs
jobetest01:/var/www/html/jobe$
To get it working, I need to run:
$ sudo chmod a+rx /home/jobe
so that permission of /home/jobe becomes drwxr-xr-x
Now everything works.
Please ignore my previous comments. I remember now that when I initially set up jobe I ran into some problems, so I changed the permissions manually. This message, however, shows the permissions and ownership set up by install.sh, and the change I needed to get it working.
To sum it up, the only thing we need to add to install.sh is "chmod a+rx /home/jobe"
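A check along these lines could guard that step. This is only a sketch, not the actual change made to install.sh. It assumes GNU stat and that the web server user falls into the "other" class for /home/jobe, as in the root:root listing above. The function name is hypothetical.

```shell
# Succeeds if the directory's "other" class has the search (x) bit set.
is_world_searchable() {
    mode=$(stat -c '%a' "$1") || return 2
    other=${mode#"${mode%?}"}      # last octal digit = permissions for others
    [ $(( other & 1 )) -eq 1 ]
}

# Example use (path from the thread):
# is_world_searchable /home/jobe || sudo chmod a+rx /home/jobe
```

Checking the mode bits rather than attempting an access keeps the result the same whether the script runs as root or as an ordinary user.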
Resolved with change to install script.
jobe fails to delete temporary data stored in /tmp after the job finishes. On many VM systems, storage space in /tmp is very limited, so this bug causes /tmp to fill up quickly, making the VM unusable.
The script needs to ensure that a job's temporary data is deleted immediately after the job ends and the results have been returned to Moodle.