Open toomanycats opened 3 weeks ago
At first glance this could be an error caused by a MAC (mandatory access control) solution. Can you check whether AppArmor or SELinux is the culprit, i.e. disable whichever one is active and see if the error disappears?
That's a good idea, but it didn't help. I set SELinux into permissive mode, rebooted, and received the same error.
This new storage is a cluster so I was hoping that might work.
What do you think about this function: sge_filecmp in source/libs/uti/sge_io.c, line 166?
/****** uti/io/sge_filecmp() **************************************************
*  NAME
*     sge_filecmp() -- Compare two files
*
*  SYNOPSIS
*     int sge_filecmp(const char *name0, const char *name1)
*
*  FUNCTION
*     Compare two files. They are equal if:
*        - both of them have the same name
*        - if a stat() succeeds for both files and
*          i-node/device-id are equal
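To make the rule in that header concrete, here is a minimal sketch of the comparison it describes. This is not the Grid Engine source: the name filecmp_sketch and the 0/1 return convention are assumptions made for illustration. The point of interest for this issue is the stat() step, since stat() needs search permission on every directory in the path and can fail with EACCES even when the file itself is readable.

```c
#include <string.h>
#include <sys/stat.h>

/* Hypothetical stand-in for the behaviour described in the header comment.
 * Returns 0 if the two paths are considered the same file, 1 otherwise
 * (the return convention is an assumption). */
int filecmp_sketch(const char *name0, const char *name1)
{
    struct stat buf0, buf1;

    /* Identical path strings: equal without touching the file system. */
    if (strcmp(name0, name1) == 0) {
        return 0;
    }

    /* If either stat() fails (ENOENT, EACCES, ...), treat as not equal. */
    if (stat(name0, &buf0) != 0 || stat(name1, &buf1) != 0) {
        return 1;
    }

    /* Same device id and same i-node number: same underlying file. */
    if (buf0.st_dev == buf1.st_dev && buf0.st_ino == buf1.st_ino) {
        return 0;
    }
    return 1;
}
```

With this sketch, "/" and "/." compare as equal through the i-node check even though the strings differ, while two paths that cannot be stat()ed compare as different.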
Not sure, but given that the error message explicitly says "Permission denied", I would assume the error is somewhere in the file system permissions.
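One way to test that assumption outside of the scheduler is a small probe that tries to create a file in the job's output directory and reports the exact errno. The sketch below is only an illustration: probe_create and the scratch file name are hypothetical, and it assumes the output files are opened under the job owner's identity, so it would need to be run as that user on an execution host.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: create and remove a scratch file inside dir.
 * Returns 0 on success, otherwise the errno of the failing call. */
int probe_create(const char *dir)
{
    char path[4096];
    int n, fd;

    /* Build a per-process scratch name inside the target directory. */
    n = snprintf(path, sizeof(path), "%s/.probe_%ld", dir, (long)getpid());
    if (n < 0 || (size_t)n >= sizeof(path)) {
        return ENAMETOOLONG;
    }

    /* O_EXCL makes the create fail rather than reuse an existing file. */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0) {
        int err = errno;
        fprintf(stderr, "open(%s): %s\n", path, strerror(err));
        return err;
    }
    close(fd);

    /* Clean up; a failure here is also worth reporting. */
    if (unlink(path) != 0) {
        return errno;
    }
    return 0;
}
```

Calling it in a loop from several hosts against the same directory may show whether the failure is intermittent at the file system level or only appears when jobs are started through SGE.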
I'm tracking down a very obscure error where about 30% of submitted jobs go into the Eqw state. The error is always the same.
We thought this was due to using a brand new storage appliance. However, when permissions are set wide open there's no change in the behavior. I've captured NFS traffic and have been analyzing it in Wireshark. I don't see any FSSTAT calls failing.
I'm wondering whether the SGE daemon creates the stdout and stderr files in the SGE root directory and the client then copies them out?
Any ideas are appreciated.