High Resolution runs only work with pnetcdf/1.5.0

vanroekel commented 7 years ago

In running an MPAS-O case with approximately 3.6 million cells (RRS18to6) our log.err file has numerous instances of an error "Bad return value from PIO" and the file written has size 0. If we use pnetcdf/1.5.0, this does not happen, the output looks reasonable and valid (verified with ncdump and visualizing with paraview).

After digging through the framework and comparing pnetcdf versions, it appears pnetcdf/1.5.0 works because there was a bug in that version that was remedied in later versions. In MPAS, we use NC_64BIT_OFFSET by default for output For cdf-2 files, any single variable cannot exceed 4GB in size. Any variable in my 18to6 run that is dimensioned nedges by nVertLevels (e.g., normalVelocity) has a size of ~8GB and thus violates this constraint. In pnetcdf/1.5.0 only the variable dimensions were accounted for and there was no consideration for the size of an element. This allowed us to pass the size check and proceed to file writes. This was remedied in pnetcdf/1.6.0 and we can no longer write using NC_64BIT_OFFSET. I still do not understand why I get valid output for an array that violates cdf-2 constraints, and am communicating with pnetcdf developers on this (see the discussion at https://trac.mcs.anl.gov/projects/parallel-netcdf/ticket/29). However, I think the more appropriate solution is to switch the default output to NC_64BIT_DATA (cdf-5), or at least allow easier use of this option. From what I can tell in the framework there is not an easy way to use NC_64BIT_DATA. If I look at this block from mpas_io.F

      if (ioContext % master_pio_iotype /= -999) then
         pio_iotype = ioContext % master_pio_iotype
         pio_mode = PIO_64BIT_OFFSET
      else
         if (ioformat == MPAS_IO_PNETCDF) then
            pio_iotype = PIO_iotype_pnetcdf
            pio_mode = PIO_64BIT_OFFSET
         else if (ioformat == MPAS_IO_PNETCDF5) then
            pio_iotype = PIO_iotype_pnetcdf
            pio_mode = PIO_64BIT_DATA
         else if (ioformat == MPAS_IO_NETCDF) then
            pio_iotype = PIO_iotype_netcdf
            pio_mode = PIO_64BIT_OFFSET
         else if (ioformat == MPAS_IO_NETCDF4) then
            pio_iotype = PIO_iotype_netcdf4p
            pio_mode = PIO_64BIT_OFFSET
         end if
      end if

I can only get to the 64BIT_DATA option if master_pio_iotype is set. Yet I can't seem to find where this happens. I see no calls to MPAS_io_set_iotype. Am I missing it? Or is 64BIT_OFFSET currently the only option for output? If so, is it possible to change this? 64BIT_DATA seems to work in my tests with modified framework, but don't know if I'm missing something else about why output is only written in cdf-2.

vanroekel commented 7 years ago

@mgduda could you comment on this when you have a moment? This is a fairly high priority ACME need.

mark-petersen commented 7 years ago

@mgduda I just put this on the agenda for the Mon 4/10 telecon.

mark-petersen commented 7 years ago

From @vanroekel: One other thing regarding the PIO errors, if we do switch to cdf5 output, we must require netcdf 4.4.0 or above. This is default on LANL IC and titan at least. But does not exist on rhea.

mgduda commented 7 years ago

@vanroekel , @mark-petersen Sorry for not replying earlier! We've observed the same behavior regarding pnetCDF versions as well: 1.5.0 seems to allow us to write large variables (as long as any individual MPI task does not write more than 4 GB of the variable), but newer versions of pnetCDF stop us with a format constraint error. My guess as to why this results in a "valid" output file is that the netCDF format never explicitly stores the size of a variable or record, but only the dimensions of that variable; so, as long as one only reads back <4 GB at a time, there may not be problems.

At least for stand-alone MPAS models, selecting PIO_64BIT_DATA is simply a matter of adding io_type="pnetcdf,cdf5" to the definition of any stream, e.g.,

<stream name="output"
        type="output"
        io_type="pnetcdf,cdf5"
        filename_template="history.$Y-$M-$D_$h.$m.$s.nc"
        output_interval="6:00:00" >

        <var name="hugevariable1"/>
        <var name="hugevariable2"/>
        <var name="hugevariable3"/>
</stream>

If ACME is using the standard XML stream definition files, I suspect that just adding this attribute would work for you all, too.

Just for reference, at the end of section 5.2 in the atmosphere users' guide (perhaps in a different section of other users' guides) is a description of the io_type attribute.

As was noted on our developers' telecon today, though, post-processing CDF5 / 64BIT_DATA files can be tricky. As a method of last resort, I have some Fortran code that links against the regular netCDF and pnetCDF libraries and could be modified to serve as a converter between CDF5 and CDF2 or HDF5 formats.

matthewhoffman commented 7 years ago

@vanroekel , @mark-petersen , it sounds like in the case of the RRS18to6 mesh, it might be that the only commonly used field that violates the 4GB restriction is normalVelocity. If so, then we might be able to avoid the headaches of postprocessing with CDF5 format files, because the history files could still use CDF2 format (I'm assuming normalVelocity is not of interest for history files). (I.e., just the restart stream could be set to use CDF5)

vanroekel commented 7 years ago

@matthewhoffman this sounds like a reasonable plan I think velocityZonal and velocityMerdional are in all other outputs. I will test this in the 18to6.

xylar commented 7 years ago

@vanroekel and @matthewhoffman, if we ever want to compute the Meridional Overturning Circulation with post processing from the RRS18to6 (which we very likely don't), we would need normalVelocity to do so. I'm just mentioning it now because I want to make sure we're removing that possibility with eyes open. @milenaveneziani?

vanroekel commented 7 years ago

All, as I note above, I have no problems with analysis of CDF5 data as long as the netcdf is up to date (>= 4.4.0). For example, all output for my G-case High Res is CDF5 and I can run analysis with no issues as the netcdf is v4.4.1, but it won't work on rhea, as netcdf is 4.3.3.1. Once machines have netcdf upgraded, I don't foresee problems. But I may be missing some headaches noted by @matthewhoffman and @mgduda. I was told by pnetcdf developers that the newer versions of nco and netcdf fully support CDF5, so I suspect these headaches won't be long term.

milenaveneziani commented 7 years ago

If possible, I think it would be a good think to keep outputting normalVelocity until we test the MOC am at low- and high-resolution, and we make sure things work as expected.

vanroekel commented 7 years ago

It is fine with me to leave normalVelocity, the only caveat is that the analysis will not work on rhea until netcdf is upgraded. I have put in a request already.

maltrud commented 7 years ago

i agree with Milena--we need to keep normalVelocity. in the meantime, it sounds like Michael's code would be nice. my workaround today (for small files) is to ncdump to an cdl file on a 4.4.1 machine (eg titan), then ncgen that file back to netcdf using 4.4.0 (eg, rhea).

matthewhoffman commented 7 years ago

A short term workaround might be to stick normalVelocity in its own output file with CDF5 format. That way the primary output will still work with older netCDF libraries, but you'd still have normalVelocity around if needed.

vanroekel commented 7 years ago

The problem with moving normal velocity is that it will require timeSeriesStatsMonthly to be split as well. This is the file and normalVelocity that is used in MPAS-Analysis. Is that correct @milenaveneziani ?

mark-petersen commented 7 years ago

@vanroekel Could you re-run an output test with the RRS18to6 to confirm that

<stream name="output"
        type="output"
        io_type="pnetcdf,cdf5"

works with pnetcdf >1.5.0, without any code changes for PIO_64BIT_OFFSET? You would need it on output and restart stream. I think that is where to start on this. Thanks a lot, I know you are getting lots of "please test this" requests.

vanroekel commented 7 years ago

@mark-petersen I can not get this to work in ACME, it seems that PIO_TYPENAME supersedes all io_type choices, and pnetcdf,cdf5 is not a valid option for ACME right now. @jonbob has found similar behavior in his tests.

jonbob commented 7 years ago

@vanroekel - I found something in the driver that sets a master pio_type, which then controls all pio output from there. I have a test to try tomorrow, so we can chat first thing

vanroekel commented 7 years ago

great! I was just chatting with @mark-petersen about this and we found the same and are going to try as well.

jonbob commented 7 years ago

Ah, great minds and all....

mark-petersen commented 7 years ago

@vanroekel and @jonbob I just ran the exact restart test on MPAS-O on wolf, with netcdf 4.4.0, using the flag io_type="pnetcdf,cdf5" on the restart stream, on the current ocean/develop. It can write and read correctly, and passes the b-f-b comparison between the 8 hr and the 4hr+restart+4hr runs. ncdump -k confirms the cdf5 format. Modules:

wf273.localdomain> module list
Currently Loaded Modulefiles:
  1) git/2.11.0                     3) intel/15.0.5                   5) /netcdf/4.4.0                  7) /pio/1.7.2
  2) /python/anaconda-2.7-climate   4) openmpi/1.6.5                  6) /parallel-netcdf/1.5.0

This did not work on grizzly, but just because the netcdf is an early version by mistake:

gr-fe3.lanl.gov> module list
Currently Loaded Modules:
  1) python/anaconda-2.7-climate   2) gcc/5.3.0   3) openmpi/1.10.5   4) netcdf/4.4.1   5) parallel-netcdf/1.5.0   6) pio/1.7.2
gr-fe3.lanl.gov> which ncdump
/usr/projects/climate/SHARED_CLIMATE/software/grizzly/netcdf/4.4.1/gcc-5.3.0/bin/ncdump
gr-fe3.lanl.gov> ncdump
...
netcdf library version 4.3.3.1 of Apr  5 2017 22:24:50 $

In other words, we should proceed and try this in ACME. I can try on edison.

jonbob commented 7 years ago

@mark-petersen - that's great, but do we need to check the netcdf version somehow before changing the setting? I've been mucking about in the scripts and figured out how to do most of what we need, but that would complicate it....

vanroekel commented 7 years ago

@mark-petersen, In checking netcdf is highly variable in its versions on IC. only the intel15.0.5 version will work. All others have netcdf c bindings of 4.3.2. So it seems like a check of version will be needed unfortunately.

mark-petersen commented 7 years ago

@vanroekel and @jonbob Good news. I ran 1 day + restart + 1 day on edison, EC60to30v3 G case. Once with default settings, once with restart stream flag io_type="pnetcdf,cdf5". It works, and I get bfb at the final time between the two runs. I've confirmed that the restart file is cdf5:

edison02> pwd
/scratch2/scratchdirs/mpeterse/acme_scratch/edison/a18i/run
edison02> ncdump -k mpaso.rst.0001-01-03_00000.nc
cdf5

The only change needed was:

diff --git a/components/mpas-o/driver/ocn_comp_mct.F b/components/mpas-o/driver/ocn_comp_mct.F
index 59e1926..d1a5c3c 100644
--- a/components/mpas-o/driver/ocn_comp_mct.F
+++ b/components/mpas-o/driver/ocn_comp_mct.F
@@ -297,7 +297,8 @@ contains
     io_system => shr_pio_getiosys(ocnid)

     pio_iotype = shr_pio_getiotype(ocnid)
-    call MPAS_io_set_iotype(domain % iocontext, pio_iotype)
+!!! mrp do not set pio type from PIO_TYPENAME in env_run.xml
+!!!    call MPAS_io_set_iotype(domain % iocontext, pio_iotype)

This is compiled on edison with intel, so libraries are:

    <modules compiler="intel">
      <command name="load">PrgEnv-intel</command>
      <command name="rm">intel</command>
      <command name="load">intel/15.0.1.133</command>
      <command name="rm">cray-libsci</command>
    </modules>
    <modules mpilib="!mpi-serial">
      <command name="load">cray-netcdf-hdf5parallel/4.4.0</command>
      <command name="load">cray-hdf5-parallel/1.8.16</command>
      <command name="load">cray-parallel-netcdf/1.6.1</command>
    </modules>

jonbob commented 7 years ago

@vanroekel - does this issue get closed with ACME PR #1456? Or should we test first?

vanroekel commented 7 years ago

@jonbob that PR does fix it the issue for ACME, but this issue does remain on the MPAS stand alone side. I'm not how we should address the issue for MPAS-O only simulations on the 18to6 mesh. any thoughts? I don't think modifying streams.ocean in default inputs is wise, perhaps this is a modification to the testing infrastructure? Pinging @mark-petersen as well.

mark-petersen commented 7 years ago

I think io_type="netcdf" should be the default in Registry.xml in MPAS-O stand along for all non-partitioned AM output streams. I can't think of any reason not to - can anyone else?

I would leave everything else as io_type="pnetcdf,cdf2" for now because we don't have cdf5 tools everywhere. We could put cdf5 in the testing scripts for RRS18to6 restart.

vanroekel commented 7 years ago

I agree with what @mark-petersen proposes for stand alone, especially as LANL does not yet have cdf5 capabilities.

MPAS-Dev / MPAS

High Resolution runs only work with pnetcdf/1.5.0 #1279