pmodels / mpich

Official MPICH Repository
http://www.mpich.org

romio: MPI_File_open error with specified filesystem is not available #5772

Closed ericch1 closed 2 years ago

ericch1 commented 2 years ago

Hi,

Our nightly tests were OK on Jan 7 with commit a92ce59da2d429664570779d1e693288bd40e40f. But on Jan 8 we compiled with commit ea9b0c7e61602eecd08b6ca9953077beb173485a, and now we have a lot of errors like this one:

returned error: 1006693664
Other I/O error , error stack:
ADIO_RESOLVEFILETYPE(650): Specified filesystem is not available

issued by an MPI_File_open call.

Everything is fine with mpich-3.2.1 and with OpenMPI 3.x, 4.x, and 5.x.

I just saw commit f77d6f71b902410ba99e9c01dd003f5afba7f0c0 from @roblatham00, which may be the cause?

Thanks a lot!

Eric

Here are all the build logs from MPICH and PETSc:

http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.08.05h36m02s_config.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.08.05h36m02s_config.system
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.08.05h36m02s_mpich_version.txt
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.08.05h36m02s_c.txt
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.08.05h36m02s_m.txt
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.08.05h36m02s_mi.txt
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.08.05h36m02s_mpl_config.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.08.05h36m02s_pm_hydra_tools_topo_config.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.08.05h36m02s_mpiexec_info.txt

http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.08.05h36m02s_configure.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.08.05h36m02s_make.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.08.05h36m02s_RDict.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.08.05h36m02s_make_test.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.08.05h36m02s_make_streams.log

roblatham00 commented 2 years ago

Thanks for flagging this. It does seem like I'm at fault here... can you tell me more about the platform? I'm bothered that my integration testing did not catch this.

ericch1 commented 2 years ago

I am on SUSE Leap 15.3.

The MPI_File_open call is made on a file on an encrypted partition on a local disk (no NFS here).

I have 428 failures out of 1555 tests, so some tests are able to open their files and others are not...

I am investigating two tests by hand (let's call them A and B):

A) i. If I launch with mpiexec -n 1, the test passes! ii. I launched with mpiexec -n 2 valgrind ... but found only a few leaks, nothing harmful.

B) If I launch with mpiexec -n 1, the test still fails but gets further: it fails on an "output" file instead of an input file... and the error code is different (is that normal?): 1007217952 instead of 1006693664.

I passed MPI_INFO_NULL instead of the usual MPI_Info, but it changed nothing...

Since I compiled mpich in debug mode, I can follow the execution in a debugger; maybe you can point me to where I should place a breakpoint?

roblatham00 commented 2 years ago

In my modifications I added a romio_statfs routine to hide the many different ways operating systems do statfs. I would look at that routine -- the output value file_id should be a file system "magic value" like the ones enumerated in https://github.com/pmodels/mpich/blob/main/src/mpi/romio/adio/common/ad_fstype.c#L73, or UNKNOWN_SUPER_MAGIC, which tells ROMIO "I could not find anything specific, so I will treat this like a generic POSIX file system".
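
Roughly, such a wrapper boils down to something like this on Linux (an illustrative sketch, not the actual romio_statfs; the name statfs_magic is made up):

#include <sys/vfs.h>    /* statfs(2) on Linux */
#include <errno.h>
#include <stdint.h>

/* Sketch only: report the file system "magic" number for a path. */
static int statfs_magic(const char *filename, int64_t *file_id)
{
    struct statfs fsbuf;

    if (statfs(filename, &fsbuf) == -1)
        return -errno;               /* caller turns this into an MPI error */

    *file_id = fsbuf.f_type;         /* e.g. 0x6969 for NFS, 0x58465342 for XFS */
    return 0;
}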

ADIO_FileSysType_fncall calls romio_statfs on either the file or the file's parent directory. Its role is to turn the file system magic value (or UNKNOWN_SUPER_MAGIC) into the appropriate ROMIO file system identifier.

(Now that I think about it, there's no need for both file system magic values and ROMIO file system identifiers... inelegant but not defective, I think?)

Once we have that identifier, we can search the 'fstypes' table to map identifiers to a table of function pointers. The mapping is at https://github.com/pmodels/mpich/blob/main/src/mpi/romio/adio/common/ad_fstype.c#L160 and searching the map is done in e.g. https://github.com/pmodels/mpich/blob/main/src/mpi/romio/adio/common/ad_fstype.c#L642
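
In miniature, the two-step mapping looks something like this (an illustrative sketch with made-up names, not ROMIO's actual tables):

#include <stddef.h>
#include <stdint.h>

#define NFS_SUPER_MAGIC 0x6969          /* from linux/magic.h */

enum fsid { FSID_UFS, FSID_NFS };       /* stand-ins for ROMIO's identifiers */

struct fs_ops {
    enum fsid id;
    int (*open_fn)(const char *path);   /* stand-in for the function-pointer table */
};

static int posix_open(const char *path) { (void) path; return 0; }
static int nfs_open(const char *path)   { (void) path; return 0; }

static const struct fs_ops fstypes[] = {
    { FSID_UFS, posix_open },
    { FSID_NFS, nfs_open },
};

static const struct fs_ops *resolve(int64_t magic)
{
    /* Step 1: magic value -> identifier; unknown values fall back to UFS. */
    enum fsid id = (magic == NFS_SUPER_MAGIC) ? FSID_NFS : FSID_UFS;

    /* Step 2: search the table for the operations matching that identifier. */
    for (size_t i = 0; i < sizeof fstypes / sizeof fstypes[0]; i++)
        if (fstypes[i].id == id)
            return &fstypes[i];
    return NULL;
}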

Now that I type all that out... phew what a mess.

roblatham00 commented 2 years ago

From your description (it fails on an output file), it sounds like I need to look more closely at the "check the parent dir if the given file is not found" logic (https://github.com/pmodels/mpich/blob/main/src/mpi/romio/adio/common/ad_fstype.c#L401).
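
The intent of that logic, roughly, is the following (again an illustrative sketch, not the actual code: an output file usually does not exist yet, so we have to look at its containing directory instead):

#include <sys/vfs.h>
#include <errno.h>
#include <libgen.h>     /* dirname */
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Sketch: statfs the file if it exists, otherwise its parent directory. */
static int statfs_file_or_parent(const char *filename, int64_t *file_id)
{
    struct statfs fsbuf;

    if (statfs(filename, &fsbuf) == 0) {
        *file_id = fsbuf.f_type;
        return 0;
    }
    if (errno != ENOENT)
        return -errno;

    /* File not created yet (typical for output files): check its directory. */
    char *copy = strdup(filename);       /* dirname() may modify its argument */
    int rc = statfs(dirname(copy), &fsbuf);
    int saved_errno = errno;
    free(copy);
    if (rc == -1)
        return -saved_errno;

    *file_id = fsbuf.f_type;
    return 0;
}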

ericch1 commented 2 years ago

Hmm, talk of the "parent dir" reminds me that a long time ago we hit an issue with MPI_File_open and very long paths... since then we always call chdir before any call to MPI_File_open, and afterwards we switch back to the previous working directory with chdir (the ticket was from 2014: https://github.com/pmodels/mpich/issues/2212)...
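
For context, our workaround is essentially of this shape (a simplified sketch, not our actual code; the helper name open_with_chdir is made up, and it assumes all ranks see the same file system):

#include <mpi.h>
#include <unistd.h>
#include <limits.h>
#include <libgen.h>
#include <string.h>
#include <stdlib.h>

/* Sketch: open a file with a short relative name by chdir'ing into its
 * directory first, then restore the previous working directory. */
static int open_with_chdir(MPI_Comm comm, const char *long_path,
                           int amode, MPI_File *fh)
{
    char cwd[PATH_MAX];
    if (getcwd(cwd, sizeof cwd) == NULL)
        return MPI_ERR_OTHER;

    char *dir_copy  = strdup(long_path);   /* dirname/basename may modify */
    char *base_copy = strdup(long_path);

    int rc = MPI_ERR_OTHER;
    if (chdir(dirname(dir_copy)) == 0)
        rc = MPI_File_open(comm, basename(base_copy), amode,
                           MPI_INFO_NULL, fh);

    if (chdir(cwd) != 0)                   /* best effort to restore the cwd */
        rc = MPI_ERR_OTHER;

    free(dir_copy);
    free(base_copy);
    return rc;
}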

ericch1 commented 2 years ago

OK, I just tried a very simple example on my system, and it fails right at the MPI_File_open call with the compiled mpich/master...

See the source code fo.c (attached as fo.txt): fo.txt
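
For readers who cannot fetch the attachment, the test is roughly of this shape, reconstructed from the output shown below; the access-mode flags are an assumption, only the file name "temp" and the messages come from the output:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    char msg[MPI_MAX_ERROR_STRING];
    int len;

    MPI_Init(&argc, &argv);

    /* The default error handler for files is MPI_ERRORS_RETURN, so we can
     * inspect the return code ourselves. */
    int rc = MPI_File_open(MPI_COMM_WORLD, "temp",
                           MPI_MODE_CREATE | MPI_MODE_WRONLY,
                           MPI_INFO_NULL, &fh);
    if (rc != MPI_SUCCESS) {
        printf("Unable to open file \"temp\"\n");
        MPI_Error_string(rc, msg, &len);
        printf("MPI_Error_string: %s\n", msg);
    } else {
        printf("Success\n");
        MPI_File_close(&fh);
    }

    MPI_Finalize();
    return 0;
}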

If I source the mpich/master build compiled/configured as above:

source /opt/mpich-3.x_debug/mpilibs.sh 
(20:59:49) [lorien]:tmp> which mpicc
/opt/mpich-3.x_debug/bin/mpicc
(20:59:55) [lorien]:tmp> mpicc -o fo fo.c 
(21:00:03) [lorien]:tmp> ./fo 
Unable to open file "temp"
MPI_Error_string: Other I/O error , error stack:
ADIO_RESOLVEFILETYPE(650): Specified filesystem is not available 

with older mpich 3.3.2:

source /opt/mpich-3.3.2/mpilibs.sh 
(21:01:32) [lorien]:tmp> which mpicc
/opt/mpich-3.3.2/bin/mpicc
(21:01:36) [lorien]:tmp> mpicc -o fo fo.c 
(21:01:39) [lorien]:tmp> ./fo 
Success

roblatham00 commented 2 years ago

That's a wonderfully simple test case. Of course, it prints 'Success' for me and not an error.

Can you run stat -f on that directory? For example, on my laptop I see:

% stat -f .
  File: "."
    ID: fedc9aa3bd65bc57 Namelen: 255     Type: ext2/ext3
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 65793553   Free: 36657953   Available: 33298414
Inodes: Total: 16777216   Free: 15579348

or on my workstation I see:

% stat -f ${HOME}
  File: "/home/robl"
    ID: 0        Namelen: 255     Type: nfs
Block size: 32768      Fundamental block size: 32768
Blocks: Total: 1638400    Free: 1556782    Available: 1556782
Inodes: Total: 99727275   Free: 99634022

both of which handle your test case just fine.

ericch1 commented 2 years ago

Here I have:

./fo 
Unable to open file "temp"
MPI_Error_string: Other I/O error , error stack:
ADIO_RESOLVEFILETYPE(650): Specified filesystem is not available 
(10:27:31) [lorien]:tmp> stat -f .
  File: "."
    ID: fe0000000000 Namelen: 255     Type: xfs
Block size: 4096       Fundamental block size: 4096
Blocks: Total: 244070463  Free: 151390632  Available: 151390632
Inodes: Total: 488379392  Free: 486352124

ericch1 commented 2 years ago

It works on NFS:

./fo 
Success

(10:30:44) [lorien]:~> stat -f .
  File: "."
    ID: 0        Namelen: 255     Type: nfs
Block size: 1048576    Fundamental block size: 1048576
Blocks: Total: 1877657    Free: 343916     Available: 248513
Inodes: Total: 122101760  Free: 86046440

roblatham00 commented 2 years ago

Hah! Thank you, that must be the key. 'xfs' is a special case: a file system that might use XFS-specific optimizations (some work our SGI friends contributed... back when SGI was a thing) but that should normally be treated like a regular POSIX file system.
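
One plausible shape of such a fix, sketched with made-up names (the actual patch is in the pull request linked below; HAVE_XFS_DRIVER is an assumed build-time flag, not a real ROMIO option):

#include <stdint.h>

#define XFS_SUPER_MAGIC 0x58465342      /* from linux/magic.h */

/* Hypothetical stand-ins for ROMIO's file system identifiers. */
enum adio_fstype { MY_ADIO_UFS, MY_ADIO_XFS };

static enum adio_fstype classify(int64_t magic)
{
    switch (magic) {
    case XFS_SUPER_MAGIC:
#ifdef HAVE_XFS_DRIVER                  /* assumed build-time option */
        return MY_ADIO_XFS;             /* use the XFS-specific code path */
#else
        return MY_ADIO_UFS;             /* otherwise fall back to plain POSIX */
#endif
    default:
        return MY_ADIO_UFS;             /* treat unknown magic values as POSIX too */
    }
}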

OK, I can patch this up. Thanks for the information!

roblatham00 commented 2 years ago

Give https://github.com/pmodels/mpich/pull/5781 a shot. I'm running it through the integration tests now.

ericch1 commented 2 years ago

> give #5781 a shot. I'm running it through the integration tests now.

It is now working for the little test I shared with you. :)

I am now running our whole test suite and will come back in a few hours...

Thanks a lot!

ericch1 commented 2 years ago

@roblatham00 all of our tests are OK now with your fix, thanks a lot! :)

http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.22.07h42m08s_config.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.22.07h42m08s_config.system
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.22.07h42m08s_mpich_version.txt
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.22.07h42m08s_c.txt
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.22.07h42m08s_m.txt
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.22.07h42m08s_mi.txt
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.22.07h42m08s_mpl_config.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.22.07h42m08s_pm_hydra_config.log
http://www.giref.ulaval.ca/~cmpgiref/dernier_mpich/2022.01.22.07h42m08s_mpiexec_info.txt