CICE-Consortium / CICE

Development repository for the CICE sea-ice model

PIO createfile where path is a symlink #950

Open anton-seaice opened 6 months ago

anton-seaice commented 6 months ago

In the UFS model configuration, the output files can be configured as symlinks which are created before the model is run.

For example, a symlink is created at the desired location of the history / restart files, but the target of the symlink doesn't exist yet. CICE creates the actual file during model execution.

@DeniseWorthen has found that with _format = 'hdf5' this fails with the following error:

 (ice_pio_init) ERROR: Failed to create file .

This is because: https://github.com/CICE-Consortium/CICE/blob/0af031d785d3bc622cd19af48a2e9465b5abe9a0/cicecore/cicedyn/infrastructure/io/io_pio2/ice_pio.F90#L200

gives exists=False where filename is a symlink whose target does not exist. Therefore the file is created with "NOCLOBBER", and that fails. (EDIT: It seems like the inquire in Fortran and the PIO library or its dependencies interpret symlinks differently. inquire returns exists=False, but creating with NOCLOBBER fails.)
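The mismatch described above can be reproduced outside of CICE and PIO. The following is a minimal sketch in Python, assuming that a NOCLOBBER create ultimately reaches a POSIX `open()` with `O_CREAT | O_EXCL` (an assumption about the lower layers, not something confirmed in this thread). The file names are hypothetical.

```python
import errno
import os
import tempfile


def probe_dangling_symlink():
    """Return (exists_before, errno_from_exclusive_create) for a dangling symlink."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.nc")  # hypothetical name, never created here
        link = os.path.join(tmp, "history.nc")   # hypothetical name
        os.symlink(target, link)

        # An existence check that follows the link reports the file as missing.
        exists_before = os.path.exists(link)

        # An exclusive create is refused because the link itself occupies the path.
        try:
            fd = os.open(link, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
        except OSError as err:
            return exists_before, err.errno
        os.close(fd)
        return exists_before, None


exists_before, err = probe_dangling_symlink()
print(exists_before, err == errno.EEXIST)
```

On a POSIX system the existence check comes back False while the exclusive create is rejected with EEXIST, which matches the combination seen here: exists=False followed by a failed NOCLOBBER create.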

We think we can remove this "NOCLOBBER" case and always create the file with "CLOBBER", whether or not it already exists. This works with modern PIO builds (e.g. 2.6.2).

We should also check with older PIO builds and possibly raise this with the PIO developers (as it's not clear why you can't create with NOCLOBBER).
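As a rough illustration of why always creating with CLOBBER could work, here is a small Python sketch. It assumes that a CLOBBER create behaves like a POSIX `open()` with `O_CREAT | O_TRUNC` and no `O_EXCL`; it does not exercise PIO, HDF5, or any particular file system, and the file names are hypothetical.

```python
import os
import tempfile


def create_through_symlink():
    """Create a file via a dangling symlink using truncate-style (clobber) flags."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.nc")  # hypothetical name
        link = os.path.join(tmp, "restart.nc")   # hypothetical name
        os.symlink(target, link)

        # Without O_EXCL, open() follows the link and creates the target file.
        fd = os.open(link, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        os.write(fd, b"data")
        os.close(fd)

        return os.path.islink(link), os.path.isfile(target), os.path.getsize(target)


still_link, target_created, size = create_through_symlink()
print(still_link, target_created, size)
```

In this sketch the symlink is left in place and the data lands in the target, which is the behaviour the UFS configuration relies on.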

anton-seaice commented 3 months ago

I tried commenting out the if (exists) statement and always creating with lclobber.

This got further, but failed with

get_stripe failed: 61 (No data available)

Which I think is a Lustre-specific issue with symlinks (https://github.com/open-mpi/ompi/issues/12141, and so should be fixed in openmpi 5.x.x, and in openmpi 4.1.7 if/when it gets released).

I then changed the iotype to romio, which might be more robust than the default ompi, and the fix worked.

(in cice.run I set this: mpirun -n 4 --mca io romio321 ./cice >&! $ICE_RUNLOG_FILE)

@DeniseWorthen - Is UFS using openmpi and a Lustre file system?

I don't see any downside to removing the if (exists) statement, so we can still do it. I can submit a PR if that makes it easier to discuss.

@apcraig - is it feasible to make a CICE test case where we create a symlink where the history & restart files go before running CICE?
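One possible shape for the setup step of such a test, sketched in Python rather than in the CICE test scripts themselves. The helper `link_outputs`, the directory layout, and the file names are all hypothetical; a real test would need the actual history and restart file names the run is going to write.

```python
import os
import tempfile


def link_outputs(run_dir, store_dir, names):
    """Create one dangling symlink per expected output file.

    Each link lives in run_dir and points at a not-yet-existing file in store_dir.
    """
    os.makedirs(run_dir, exist_ok=True)
    os.makedirs(store_dir, exist_ok=True)
    links = []
    for name in names:
        link = os.path.join(run_dir, name)
        if os.path.lexists(link):
            os.remove(link)  # replace a stale link or file left by an earlier run
        os.symlink(os.path.join(store_dir, name), link)
        links.append(link)
    return links


with tempfile.TemporaryDirectory() as tmp:
    links = link_outputs(
        os.path.join(tmp, "run"),
        os.path.join(tmp, "store"),
        ["history_example.nc", "restart_example.nc"],  # hypothetical names
    )
    # Every path should be a symlink whose target does not exist yet.
    dangling = [os.path.islink(p) and not os.path.exists(p) for p in links]
print(dangling)
```

After the model run, the test could then check that each link target exists and that the links themselves were not replaced by regular files.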

anton-seaice commented 3 months ago

This is the proposed change: https://github.com/CICE-Consortium/CICE/compare/main...ACCESS-NRI:CICE:link_creates

DeniseWorthen commented 3 months ago

@anton-seaice We run on a whole set of Tier-1 platforms + the operational platforms. I'm not sure which might use a lustre file system. Let me ask around.