GCEL / ILAMB

(This is the GCEL development fork of ILAMB - it is not intended to replace the ornl bitbucket site which should be referred to first.)

MPI error - file lock #14

Closed dvalters closed 5 years ago

dvalters commented 5 years ago
(ilamb27)[dvalters@racadal INLAND_TEST]$ ilamb-run --config sample.cfg --model_root $ILAMB_ROOT/MODELS/ --regions global --clean

Searching for model results in /exports/csce/datastore/geos/groups/gcel/ILAMB_runs_output/INLAND_TEST/MODELS/

                                           INLAND
                                            JULES

Parsing config file sample.cfg...

           GrossPrimaryProductivity(GPP)/CARDAMOM Initialized
               NetPrimaryProduction(NPP)/CARDAMOM Initialized
File locking failed in ADIOI_Set_lock(fd 4,cmd F_SETLKW/7,type F_WRLCK/1,whence 0) with return value FFFFFFFF and errno 25.
- If the file system is NFS, you need to use NFS version 3, ensure that the lockd daemon is running on all the machines, and mount the directory with the 'noac' option (no attribute caching).
- If the file system is LUSTRE, ensure that the directory is mounted with the 'flock' option.
ADIOI_Set_lock:: No locks available
ADIOI_Set_lock:offset 0, length 8
--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD 
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
-----------------------------------------------------------
dvalters commented 5 years ago

Apparently this is down to our NFS mount not supporting file locking? (Only since Monday, though?)

dvalters commented 5 years ago

Fixed: the file system errors have been resolved.

dvalters commented 5 years ago

Explanation for future reference:

Thanks for the update. We recently upgraded the datastore frontend presentation to a newer version (Samba / NFS). It turned out that we had a problem with NFS statd (NFS file locking). This problem was fixed on Friday the 7th.
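
If this comes up again, a quick way to check whether a directory accepts POSIX advisory locks before launching `ilamb-run` is to try taking one by hand. The sketch below is not part of ILAMB; `can_lock` is a hypothetical helper name, and it assumes a Unix system where Python's `fcntl` module is available. It takes a non-blocking exclusive lock on a temporary file, which is roughly the kind of `fcntl` write lock that `ADIOI_Set_lock` was failing to get above. Note that the `errno 25` in the MPI message appears to be printed in hexadecimal (0x25 = 37), which on Linux would be `ENOLCK`, matching the "No locks available" text.

```python
import fcntl
import os
import tempfile


def can_lock(directory):
    """Try to take and release a write lock on a temp file in `directory`.

    Returns (True, None) on success, or (False, exc) where exc is the
    OSError raised by the lock attempt (e.g. ENOLCK on a broken NFS mount).
    """
    # mkstemp opens the file read/write, which a write lock requires.
    fd, path = tempfile.mkstemp(dir=directory, prefix=".locktest_")
    try:
        try:
            # Exclusive (write) lock, non-blocking so a broken lock
            # service fails fast instead of hanging.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            return False, exc
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True, None
    finally:
        # Always clean up the probe file, whether or not locking worked.
        os.close(fd)
        os.unlink(path)
```

Calling `can_lock()` on the run's output directory (any writable path on the datastore mount) and printing the result should show whether the lock daemon is the problem: a `(False, OSError(...))` result with "No locks available" points back at the NFS side rather than at ILAMB or MPI.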