NOAA-EMC / hpc-stack

hpc-stack probably should install pnetcdf and enable it in PIO by default... #102

Open edwardhartnett opened 3 years ago

edwardhartnett commented 3 years ago

I'm still getting used to hpc-stack, but I see this in config/stack_noaa.yaml:

pio:
  build: YES
  version: 2.5.1
  enable_pnetcdf: NO
  enable_gptl: NO

You probably want to build PIO with pnetcdf. It's the best-performing parallel I/O layer for classic format netCDF files. (It won't handle netCDF/HDF5 files.)

edwardhartnett commented 3 years ago

@aerorahul did you add the PIO build?

aerorahul commented 3 years ago

Yes, I did. Sure, we can enable pnetcdf. We don't use pnetcdf in UFS, and hence it was OFF. But since the stack is serving a bigger purpose, I agree we should turn it ON.

edwardhartnett commented 3 years ago

Does this mean we also have to install pnetcdf?

aerorahul commented 3 years ago

yep!
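
The dependency just confirmed (turning on `enable_pnetcdf` for PIO implies that pnetcdf itself must be built) can be sketched as a small consistency check. This is only an illustration: plain Python dictionaries stand in for the YAML stack file, and the `pnetcdf` stanza and the function name `check_pio_pnetcdf` are assumptions, not something hpc-stack is known to provide.

```python
# Hypothetical stand-in for config/stack_noaa.yaml, written as dicts so the
# example needs no YAML parser. The "pnetcdf" stanza is an assumption.
def is_yes(value):
    """Treat YES/yes/True as enabled, mirroring the YES/NO style of the stack file."""
    return str(value).strip().upper() in ("YES", "TRUE")


def check_pio_pnetcdf(stack):
    """Return a list of problems if PIO asks for pnetcdf but pnetcdf is not built."""
    problems = []
    pio = stack.get("pio", {})
    pnetcdf = stack.get("pnetcdf", {})
    if is_yes(pio.get("build", "NO")) and is_yes(pio.get("enable_pnetcdf", "NO")):
        if not is_yes(pnetcdf.get("build", "NO")):
            problems.append("pio.enable_pnetcdf is YES but pnetcdf.build is not YES")
    return problems


# Settings as quoted at the top of this issue.
current = {"pio": {"build": "YES", "version": "2.5.1",
                   "enable_pnetcdf": "NO", "enable_gptl": "NO"}}
# Flipping the PIO switch alone is not enough ...
proposed_incomplete = {"pio": dict(current["pio"], enable_pnetcdf="YES")}
# ... pnetcdf must also be built (hypothetical stanza).
proposed = dict(proposed_incomplete, pnetcdf={"build": "YES"})

print(check_pio_pnetcdf(current))              # nothing to report: pnetcdf not requested
print(check_pio_pnetcdf(proposed_incomplete))  # reports the missing pnetcdf build
print(check_pio_pnetcdf(proposed))             # nothing to report
```

A check like this would only catch the configuration mismatch; it says nothing about whether a given PIO version actually builds cleanly against pnetcdf.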

edwardhartnett commented 3 years ago

Well if you are installing PIO, then pnetcdf should also be installed.

edwardhartnett commented 3 years ago

@aerorahul are we installing pnetcdf now? And is PIO using it by default?

aerorahul commented 3 years ago

We are not installing PNetCDF since it is not used anywhere in the UFS or JEDI applications. PIO is not building with PNetCDF in hpc-stack (since PNetCDF is not being installed).

I will let the UFS developers tell us if they wish to use PIO (with PNetCDF or without). The UFS is producing HDF5/netCDF4 files with parallelization and compression. @junwang-noaa @DusanJovic-NOAA @DeniseWorthen Please comment. I am not sure whether PIO, when built with PNetCDF, will reproduce existing UFS baselines.

aerorahul commented 3 years ago

@junwang-noaa @DusanJovic-NOAA @DeniseWorthen Please let us know if you want PIO with or without PnetCDF. I will either resolve or close this issue at the end of the week depending on your response/non-response.

DusanJovic-NOAA commented 3 years ago

The UFS components I'm familiar with (fv3, fms, ccpp etc) are not using PIO (nor PNetCDF).

junwang-noaa commented 3 years ago

I am not clear what components will use the capability of PIO built with Pnetcdf and how it benefits UFS. Can someone who made the request show some results?

aerorahul commented 3 years ago

@edwardhartnett wants to build PIO with PNetCDF in the hpc-stack. PIO uses PnetCDF for writing out netCDF classic format files. NetCDF4 does not use PnetCDF. I am sure there is more to it.

There is currently no application in the UFS (or JEDI) that uses PIO with PNetCDF. Both applications leverage netCDF-4 with the HDF5 layer to achieve parallelism.

@edwardhartnett Do you want to take this on?

edwardhartnett commented 3 years ago

@junwang-noaa I am one of the authors of PIO and did a paper/poster (with Jim Edwards) for the AMS with explanation and some performance analysis: https://www.researchgate.net/publication/348169990_THE_PARALLELIO_PIO_CFORTRAN_LIBRARIES_FOR_SCALABLE_HPC_PERFORMANCE

I'm not really familiar with why PIO ended up in our hpc-stack. Some NOAA or NCAR models are using it, apparently. (CIME does, IIRC.)

When PIO is used, pnetcdf is a very helpful and important option. It allows users to use the parallel-netcdf library for very performant I/O on classic only files.

My suggestion is that, since we are building PIO in hpc-stack, we should enable the pnetcdf option in PIO. It may be necessary for some PIO users, although they may not have articulated that need, because it operates under the hood. But anyone using PIO to access classic files will want pnetcdf as well.

edwardhartnett commented 3 years ago

Seemingly we do have some users interested in pnetcdf. Email from Ufuk Turuncoglu at UCAR:

Currently CMEPS (mediator) and CDEPS (data components) use PIO to read and write data. I am currently working on the HSUP project, which aims to create a coupled model for the HAFS application, and this also uses CDEPS for data components. Also note that if you want to take advantage of pnetcdf (parallel I/O) under PIO, PIO needs to be installed with pnetcdf support, which is not available in the current HPC-stack installations. The latest version of PIO, 2.5.3, has some CMake build fixes for pnetcdf support. Anyway, let me know if you need more information.

junwang-noaa commented 3 years ago

Ed, is it possible for us to move to netCDF-4 with the HDF5 layer in PIO? As you know, netCDF-4 with the HDF5 layer has advanced features such as parallelization and compression, which makes it more advanced than pnetcdf in my opinion; please correct me if I am wrong. Do we need to add another dependency on pnetcdf?

aerorahul commented 3 years ago

I will add that JEDI also uses netCDF4 with HDF5 for parallelisation and compression.

edwardhartnett commented 3 years ago

More info from Ufuk:

My user name is @uturuncoglu.

CMEPS is the mediator used by the UFS Weather Model to couple different components. Here are the links:

https://github.com/ESCOMP/CMEPS - top level authority repo
https://github.com/hafs-community/CMEPS - HAFS fork
https://github.com/NOAA-EMC/CMEPS - NOAA EMC fork

CDEPS provides data components through the ESMF/NUOPC layer:

https://github.com/ESCOMP/CDEPS - top level repo
https://github.com/NOAA-EMC/CDEPS - NOAA EMC fork
https://github.com/hafs-community/CDEPS - HAFS fork

Pnetcdf is only used if someone changes the default PIO namelist option to use pnetcdf rather than netcdf (the default). You can see some timing results that compare netcdf vs. pnetcdf in the following old PR (just check the thread):

https://github.com/ESCOMP/CMEPS/pull/158

This could add extra performance to the UFS model in terms of I/O from the mediator and data components.

—ufuk

edwardhartnett commented 3 years ago

@junwang-noaa to answer your questions:

1 - Yes, PIO also allows you to use netCDF4/HDF5 for output. PIO supports netCDF4/HDF5 files with the netcdf-c library, and netCDF classic format files with either the netcdf-c library, or the pnetcdf library.

2 - Indeed, there are many advanced features of the netCDF4/HDF5 format, including compression, which are not available in the classic formats. However, parallel I/O is available with both netCDF4/HDF5 files, and with classic format files (with pnetcdf).

3 - I think we should install pnetcdf, and build PIO with it, since we are building PIO. Pnetcdf is easy to build and extremely well-maintained and well-behaved, so it will not cause any problems. And then, if anyone does want to use PIO to read/write classic format files, the capability will be there.

junwang-noaa commented 3 years ago

Ed/Ufuk, if PIO can write netCDF4/HDF5 output, why don't we use it directly in CMEPS and CDEPS, or at least offer it as an option? Then we would have a consistent data format in UFS and JEDI, which also makes it easier for downstream developers to use the data.

edwardhartnett commented 3 years ago

Using PIO vs. netCDF does not affect the data format.

PIO gives enhanced capabilities with thousands of processors. It reads and writes netCDF files, so it's OK for some applications to use PIO, and some to use netCDF directly.

With some of my recent changes to PIO and netcdf-c, it's now possible to use the exact same C or Fortran code with both PIO and netCDF. (However, CMEPS and CDEPS do not do this, as they were developed some time ago.)

I'm giving an EIB talk on PIO on 4/7, more info can be found on our AMS poster: https://www.researchgate.net/publication/348170136_THE_PARALLELIO_PIO_CFORTRAN_LIBRARIES_FOR_SCALABLE_HPC_PERFORMANCE

junwang-noaa commented 3 years ago

I thought one is HDF5 and needs netCDF4/HDF5 for parallel reading/writing, while the other is not HDF5 and needs pnetcdf for parallel reading/writing. Please let me know if that is not the case.

edwardhartnett commented 3 years ago

Yes, you are correct, netCDF/HDF5 files are HDF5 and can be read/written with parallel I/O via HDF5.

Classic format files can be read/written with parallel I/O with pnetcdf, or sequentially with netcdf-c.

PIO uses netcdf-c, HDF5, and pnetcdf, so it can do sequential and parallel I/O with classic format and with netCDF/HDF5 files.

As noted earlier, compression is only available with netCDF4/HDF5 files, it is not available for classic format files. So PIO allows for compression with netCDF4/HDF5 files, but not with classic format files.
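
The combinations described in this comment can be summarized in a small lookup table. The sketch below only restates what is said in this thread; it does not query any library, and the names `CAPABILITIES` and `backends_for` are illustrative and are not part of PIO, netcdf-c, or pnetcdf.

```python
# Capability matrix as described in this thread (not queried from any library).
# Keys: (file format, library); values: which features that pairing offers.
CAPABILITIES = {
    ("classic", "netcdf-c"): {"parallel": False, "compression": False},
    ("classic", "pnetcdf"): {"parallel": True, "compression": False},
    ("netcdf4/hdf5", "netcdf-c+hdf5"): {"parallel": True, "compression": True},
}


def backends_for(file_format, parallel=False, compression=False):
    """List libraries that can handle file_format with the requested feature."""
    return sorted(
        lib
        for (fmt, lib), caps in CAPABILITIES.items()
        if fmt == file_format
        and (caps["parallel"] or not parallel)
        and (caps["compression"] or not compression)
    )


print(backends_for("classic"))                         # both classic-capable libraries
print(backends_for("classic", parallel=True))          # only pnetcdf
print(backends_for("classic", compression=True))       # nothing: classic has no compression
print(backends_for("netcdf4/hdf5", compression=True))  # netcdf-c with HDF5
```

The table treats parallel I/O and compression as independent features; whether both can be combined in a single write depends on the library versions in use and is not modelled here.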

aerorahul commented 3 years ago

So the next logical question is: who, within the realm of UFS and JEDI applications, is using netCDF classic format with PIO in a way that requires pnetcdf?

edwardhartnett commented 3 years ago

OK, now we've circled back to the beginning of this conversation, where we agreed that pnetcdf should be installed to support PIO with pnetcdf in case anyone needs it. ;-)

Are UFS and JEDI really 100% netCDF-4? If so, that's cool.

However, certainly there are many classic format netcdf files in the world of weather modeling, so it's not unreasonable to think classic format with parallel I/O might one day be desired.

junwang-noaa commented 3 years ago

From Ufuk's results, "pio_rearranger = box" also provides good results. The question is whether we want to add another dependency, pnetcdf, to the UFS/JEDI system. We are trying to limit dependencies unless they are required. Also, as you said, PIO can use netCDF4/HDF5, which already has parallel I/O and compression. So instead of going back to pnetcdf, could we stay with "pio_rearranger = box" and then update the code to call PIO with netCDF4/HDF5?

edwardhartnett commented 3 years ago

The rearranger has nothing to do with the format, so you can use the box rearranger with either format. Also, adding pnetcdf would not affect your use.

If you object to pnetcdf, then we can leave it out until someone requests it.

uturuncoglu commented 3 years ago

@edwardhartnett @junwang-noaa I set "netcdf" as the default in the UFS application, which is used by S2S. I know that CESM uses pnetcdf for all its components by default. The initial results seem close for "pnetcdf" vs. "netcdf with box rearranger", but this was a very short run, and in long, high-resolution runs pnetcdf may perform better. So we need benchmarks with different options, resolutions and model configurations. Once we have PIO with pnetcdf support we can start testing it. Along with the pnetcdf installation, the model requires some changes,

junwang-noaa commented 3 years ago

@uturuncoglu, the question is: instead of adding another dependency (pnetcdf) to the system and making cmake changes in the code, can we directly use the more advanced PIO feature with netCDF4/HDF5, which has parallel I/O and data compression?

uturuncoglu commented 3 years ago

@junwang-noaa I am not sure about compression, but parallel I/O can be used once we install PIO with pnetcdf support. @edwardhartnett What do we need to do to turn on compression with PIO? I have no idea about it. The branch on the HAFS side has changes for component-level PIO initialization, and a set of PIO-related namelist options can be configured via nems.configure for flexibility.

climbfuji commented 3 years ago

Let me note that a few years back, pnetcdf was several times faster than parallel netcdf4 in extreme scaling applications. I am not sure if this is still the case, but I do know that models like MPAS use pnetcdf for this reason, for example.

edwardhartnett commented 3 years ago

However, it was also the case that most uses of netCDF4 were using bad chunk sizes. With good chunk sizes, the gap in performance is much smaller.

climbfuji commented 3 years ago

Did we come to a conclusion? How was pio 2.5.2 installed on the NOAA RDHPC platforms? With pnetcdf support or without? If with, which pnetcdf version? I need to install PIO on Cheyenne and Gaea and want to follow what was done on hera etc. Thanks!

edwardhartnett commented 3 years ago

@uturuncoglu if you are using the classic PIO API, you can call the Fortran function PIO_def_var_deflate() or the C function PIOc_def_var_deflate() when a variable is defined. If you are using netCDF integration, then use nc_def_var_deflate() as usual in netCDF.

@climbfuji and @junwang-noaa I recommend (as is the point of this issue) that pnetcdf be installed as part of the PIO build, in order to make the full range of PIO features available.

uturuncoglu commented 3 years ago

@edwardhartnett Thanks for the information. This is for compression, right?

@climbfuji @junwang-noaa Yes, having pnetcdf support will be great. BTW, I am waiting for the UFS-level CDEPS PR (I am not sure about your plan for it), and then I'll make the required changes (including pnetcdf support) in the HAFS application; those will eventually sync with the UFS model. Also, we need CMakeModules support for pnetcdf. I have an initial implementation in the following repo and branch (https://github.com/hafs-community/CMakeModules - feature/pio_fix_comp), but it is a very basic implementation and I was able to test it only on Orion (I have no access to other platforms). So you might want to use it as a reference and extend it to cover other platforms.

edwardhartnett commented 3 years ago

See also https://github.com/NCAR/ParallelIO/blob/master/cmake/FindPnetCDF.cmake

aerorahul commented 3 years ago

And https://github.com/JCSDA/jedi-cmake/blob/develop/cmake/Modules/FindPnetCDF.cmake

uturuncoglu commented 3 years ago

@edwardhartnett @aerorahul Thanks for the pointers. I checked the NCAR one before and thought it was too complex. I did not check the JEDI one. Anyway, we could use either one to bring PnetCDF support to the build. Any preference?

aerorahul commented 3 years ago

@uturuncoglu The JEDI one provides a complete array of options along with interface targets. Usage then reduces to, e.g.:

find_package(PnetCDF REQUIRED COMPONENTS C)
target_link_libraries(target_library PUBLIC|PRIVATE PnetCDF::PnetCDF_C)

That would be my preference to be added to the CMakeModules used in UFS-weather-model.

It is what the MPAS model uses as well, if that makes any difference.

aerorahul commented 1 year ago

Is this going to be worked on?