In PnetCDF, `nc_test` has a number of quick-running tests that each create a scratch file, execute I/O operations on that file, and then delete the scratch file. The consecutive tests shown here complete quickly and all use the same filename for the scratch file.

https://github.com/Parallel-NetCDF/PnetCDF/blob/6c71a30cd95f575c01025c0c926fc06dc9157774/test/nc_test/nc_test.c#L414-L420
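
The access pattern described above can be reduced to plain POSIX calls. The following is a minimal sketch, not PnetCDF or ROMIO code: `run_scratch_cycles` and the scratch path are hypothetical names, and it assumes that each test ultimately boils down to create, write, close, and unlink on one shared path. It returns the number of cycles in which any call failed.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper (not part of PnetCDF, ROMIO, or UnifyFS):
 * repeat create -> write -> close -> unlink on the SAME path, the way
 * back-to-back tests reuse one scratch filename.
 * Returns the number of cycles in which any call failed. */
int run_scratch_cycles(const char *path, int iterations)
{
    static const char payload[] = "scratch data";
    int failures = 0;

    assert(path != NULL);

    for (int i = 0; i < iterations; i++) {
        /* Recreate the file that the previous cycle just deleted. */
        int fd = open(path, O_CREAT | O_TRUNC | O_RDWR, 0600);
        if (fd < 0) {
            perror("open");
            failures++;
            continue;
        }

        int ok = 1;
        if (write(fd, payload, sizeof(payload)) != (ssize_t)sizeof(payload)) {
            perror("write");
            ok = 0;
        }
        /* A file opened for writing may need its data synced on close. */
        if (close(fd) != 0) {
            perror("close");
            ok = 0;
        }
        /* Delete the scratch file; the next cycle reuses the name at once. */
        if (unlink(path) != 0) {
            perror("unlink");
            ok = 0;
        }
        if (!ok)
            failures++;
    }
    return failures;
}
```

On a local POSIX file system, a call such as `run_scratch_cycles("/tmp/scratch_cycle.tmp", 5)` would be expected to return 0, because the deletion is complete by the time `unlink()` returns. The pattern only becomes fragile when the deletion is finished asynchronously, after the same name has already been reused.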

The `MPI_File_open()` call of one of these tests fails when it attempts to sync extents with the server during an internal call to `close()`. ROMIO's `MPI_File_open()` calls both `open()` and `close()` on the file. The `close()` then tries to sync extents with the server because the file had been opened for writing.

https://github.com/Parallel-NetCDF/PnetCDF/blob/6c71a30cd95f575c01025c0c926fc06dc9157774/test/nc_test/test_write.m4#L394

The extent sync fails because the file has been deleted, so that `meta->fid (-1) != fid (2)` at this check:

https://github.com/LLNL/UnifyFS/blob/a13edaf779c755b3314f1bd7ce7f798d532d9951/client/src/unifyfs_fid.c#L1086-L1096

This situation happens because the prior test deleted the scratch file via `MPI_File_delete() -> unlink()`.

https://github.com/Parallel-NetCDF/PnetCDF/blob/6c71a30cd95f575c01025c0c926fc06dc9157774/test/nc_test/test_write.m4#L437

That `unlink()` invokes an unlink RPC from client to server, which induces a later server-to-client unlink callback RPC that comes back to the client at some future point in time. In this case, the unlink callback fires between the `open()` and `close()` calls inside `MPI_File_open`. The `open()` successfully recreates the file, then it is deleted due to the unlink callback, and then the `close()` fails to sync extents, which causes `MPI_File_open` to return an error:

https://github.com/LLNL/UnifyFS/blob/a13edaf779c755b3314f1bd7ce7f798d532d9951/client/src/unifyfs-sysio.c#L2322-L2330
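
The ordering problem can be replayed deterministically with a toy model. This is only a sketch under stated assumptions, not UnifyFS code: every `toy_*` name is hypothetical, the "server callback" is an explicit function call rather than a real RPC, and the fid check merely mirrors the `meta->fid != fid` comparison quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: all names are hypothetical, none come from UnifyFS. */
#define TOY_MAX_FILES 4
#define TOY_PATH_LEN  64

typedef struct {
    char path[TOY_PATH_LEN];
    int  fid;                    /* -1 means the slot holds no live file */
} toy_meta;

typedef struct {
    toy_meta files[TOY_MAX_FILES];
    bool     callback_pending;   /* an unlink callback is still in flight */
    char     pending_path[TOY_PATH_LEN];
} toy_client;

void toy_init(toy_client *c)
{
    memset(c, 0, sizeof(*c));
    for (int i = 0; i < TOY_MAX_FILES; i++)
        c->files[i].fid = -1;
}

/* Create the file in the first free slot; returns its fid or -1. */
int toy_open(toy_client *c, const char *path)
{
    assert(strlen(path) < TOY_PATH_LEN);
    for (int i = 0; i < TOY_MAX_FILES; i++) {
        if (c->files[i].fid == -1) {
            snprintf(c->files[i].path, TOY_PATH_LEN, "%s", path);
            c->files[i].fid = i;
            return i;
        }
    }
    return -1;
}

/* Mark every live file with this path as deleted. */
static void toy_drop(toy_client *c, const char *path)
{
    for (int i = 0; i < TOY_MAX_FILES; i++)
        if (c->files[i].fid != -1 && strcmp(c->files[i].path, path) == 0)
            c->files[i].fid = -1;
}

/* Delete locally and leave a server-to-client callback outstanding. */
void toy_unlink(toy_client *c, const char *path)
{
    assert(strlen(path) < TOY_PATH_LEN);
    toy_drop(c, path);
    snprintf(c->pending_path, TOY_PATH_LEN, "%s", path);
    c->callback_pending = true;
}

/* The late callback deletes whatever file currently has that path. */
void toy_deliver_unlink_callback(toy_client *c)
{
    if (c->callback_pending) {
        toy_drop(c, c->pending_path);
        c->callback_pending = false;
    }
}

/* close() syncs extents first; the sync fails if the file is gone. */
int toy_close(toy_client *c, int fid)
{
    if (fid < 0 || fid >= TOY_MAX_FILES)
        return -1;
    toy_meta *meta = &c->files[fid];
    return (meta->fid != fid) ? -1 : 0;
}

/* Replay two back-to-back tests; returns the second test's close() result. */
int toy_scenario(bool callback_between_open_and_close)
{
    toy_client c;
    toy_init(&c);

    int fid = toy_open(&c, "scratch.nc");    /* first test creates */
    toy_close(&c, fid);
    toy_unlink(&c, "scratch.nc");            /* first test deletes */

    if (!callback_between_open_and_close)
        toy_deliver_unlink_callback(&c);     /* callback lands in time */
    fid = toy_open(&c, "scratch.nc");        /* second test recreates */
    if (callback_between_open_and_close)
        toy_deliver_unlink_callback(&c);     /* callback lands too late */
    return toy_close(&c, fid);
}
```

In this model, `toy_scenario(false)` returns 0 and `toy_scenario(true)` returns -1. The only difference between the two runs is when the callback is delivered, which suggests the failure is a matter of timing rather than of the individual calls.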