gridcf / gct

Grid Community Toolkit
Apache License 2.0

globus-job-run fails because the job manager failed to create an internal script argument file #184

Open longlong10086 opened 2 years ago

longlong10086 commented 2 years ago

I get the error "the job manager failed to create an internal script argument file" when running the command "globus-job-run". How can I fix it? I hope you can help.

command

globus-job-run node102/jobmanager-fork-poll -np 1 /bin/hostname

node102 system

Red Hat Enterprise Linux Server release 6.3 (Santiago)

kernel

Linux centos7-102 3.10.0-1062.el7.x86_64 #1 SMP Wed Aug 7 18:08:02 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

gatekeeper.log

PID: 100 -- Notice: 0: executing /home/linux/globus.all/usr/sbin/globus-job-manager
TIME: Sun Apr 10 22:41:23 2022
PID: 100 -- Notice: 0: GRID_SECURITY_HTTP_BODY_FD=10
TIME: Sun Apr 10 22:41:23 2022
PID: 100 -- Notice: 0: GRID_SECURITY_HTTP_BODY_FD=10
TIME: Sun Apr 10 22:41:23 2022
PID: 101 -- Notice: 0: Set CONTENT_LENGTH=353
TIME: Sun Apr 10 22:41:23 2022
PID: 101 -- Notice: 0: Set GATEWAY_INTERFACE to CGI/1.1
TIME: Sun Apr 10 22:41:23 2022
PID: 101 -- Notice: 0: Set CONTENT_LENGTH=353
TIME: Sun Apr 10 22:41:23 2022
PID: 101 -- Notice: 0: Set GATEWAY_INTERFACE to CGI/1.1
TIME: Sun Apr 10 22:41:23 2022
PID: 101 -- Notice: 0: Set SERVER_NAME to node102
TIME: Sun Apr 10 22:41:23 2022
PID: 101 -- Notice: 0: Set SERVER_PORT to 2119
TIME: Sun Apr 10 22:41:23 2022
PID: 101 -- Notice: 0: Set SERVER_NAME to node102
TIME: Sun Apr 10 22:41:23 2022
PID: 101 -- Notice: 0: Set SERVER_PORT to 2119
TIME: Sun Apr 10 22:41:23 2022
PID: 100 -- Notice: 0: Read 146 bytes from proxy pipe
TIME: Sun Apr 10 22:41:23 2022
PID: 100 -- Notice: 0: Child 101 started
TIME: Sun Apr 10 22:41:23 2022
PID: 100 -- Notice: 0: Read 146 bytes from proxy pipe
TIME: Sun Apr 10 22:41:23 2022
PID: 100 -- Notice: 0: Child 101 started

expected logs

More log entries should follow the ones shown above.

image

fscheiner commented 2 years ago

I remembered the suggestion by @maarten-litmaath on our discuss@gridcf.org list to have a look at the following page:

https://wiki.egi.eu/wiki/Tools/Manuals/TS68

Please check if the information there helps.


Apart from that (though it is also covered on the page above), the GRAM documentation states the following:

Error Code | Reason | Possible Solutions
[...]
22 | the job manager failed to create an internal script argument file | Check that your home directory is writable and not full.
[...]

So maybe the directory permissions are configured wrongly for the actual user on node102, which prevents the creation of the script argument file.
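
The documented suggestion for error code 22 (home directory writable and not full) could be checked with something like the sketch below. It is only an illustration: `check_dir` and the probe file name are hypothetical and not part of GCT, and the script would have to be run on node102 as the local account the job is mapped to.

```shell
#!/bin/sh
# Hypothetical helper (not part of GCT): check that a directory exists,
# accepts new files, and report how much space is left on its filesystem.
check_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 1
    fi
    # Try to create and remove a scratch file, similar to what any
    # program writing into this directory would need to do.
    probe="$dir/.write_probe.$$"
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        echo "writable: $dir"
    else
        echo "not writable: $dir"
        return 1
    fi
    # A full filesystem also prevents file creation, so show free space.
    df -Pk "$dir"
}

check_dir "$HOME" || echo "fix permissions or free up space before retrying"
```

If the first line of output is "writable: ..." and the `df` output shows free blocks, the home directory itself is probably not the cause and other locations the job manager writes to are worth checking.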

longlong10086 commented 2 years ago

@fscheiner Thanks for your help. There is no problem with my home directory. However, my globus user has no permission to access /tmp (owned by root, mode 700), so I modified the source code in globus_gatekeeper.c to use $TMPDIR as the temporary directory.
After that I got the error "failed to create an internal script argument file".

But when I tried again over the last few days, the error changed to "GRAM Job submission failed because the job manager cannot find the user proxy (error code 29)".

So my question now is: what should I do when my globus user has no permission to access /tmp?
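
For reference, /tmp on Linux systems is conventionally world-writable with the sticky bit set (mode 1777), so a /tmp restricted to root with mode 700 is unusual. A minimal sketch for confirming the current mode is shown below. The function name `tmp_mode_ok` is hypothetical, `stat -c` is GNU coreutils syntax, and whether restoring the conventional mode is acceptable on node102 is an assumption that the site administrator would have to confirm.

```shell
#!/bin/sh
# Hypothetical helper: report whether a directory has the conventional
# /tmp mode (1777: world-writable with the sticky bit set).
tmp_mode_ok() {
    dir="$1"
    # stat -c is GNU coreutils syntax (available on RHEL/CentOS).
    mode=$(stat -c '%a' "$dir") || return 2
    if [ "$mode" = "1777" ]; then
        echo "ok: $dir has mode $mode"
    else
        echo "unexpected: $dir has mode $mode (conventional value is 1777)"
        return 1
    fi
}

tmp_mode_ok /tmp || echo "as root, 'chmod 1777 /tmp' would restore the conventional mode"
```

If changing /tmp is not an option, the same check can be pointed at whatever directory $TMPDIR names, to confirm that the globus user can at least create files there.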