NOAA-EMC / NCEPLIBS

Top level repo containing submodules for NCEPLIBS and associated dependencies for superproject builds

Remove gfsio library from NCEPLIBS #123

Open edwardhartnett opened 4 years ago

edwardhartnett commented 4 years ago

Should the gfsio library be removed from NCEPLIBS?

Hang-Lei-NOAA commented 4 years ago

George V: need board survey; post also uses it, and we may have more entry points for this lib.

edwardhartnett commented 4 years ago

George will take a stab at removing gfsio from post and get back to us.

GeorgeVandenberghe-NOAA commented 4 years ago

It's going to take more than commenting out GFSIO stuff in ncep_post with a sed script. gfsio also defines a data API and has modules with the types defined, so stripping this stuff with sed is beyond my sed ability.

However, it looks like all of the GFSIO stuff is confined to a few high-level routines which are not called with current workflows, so we can get rid of THEM and then all of the gfsio stuff is no longer needed. Working on that.

gfsio is an I/O option for the NCEP GFS, so any codes (global ensembles and CFS stuff, for example) that have a GFS dynamic core will also need this stripping done. There too, it is probably a high-level branch into a large contiguous chunk of gfsio code which can be removed.

GeorgeVandenberghe-NOAA commented 4 years ago

It took about an hour to clean it all out of the old ncep_post that I use for testing NCEPLIBS builds. Subroutine initpost_gfs could be completely removed from the makefile. Then the call to it in gfspost.f needed to be removed, along with the use of the gfsio module in that source. I also needed to remove gfsio references from wrfpost.f. Once this was done, the post built without a gfsio library or modules. Overall not difficult, and the next step is to check out the current EMC post and perform the same steps.
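As a rough illustration of the first pass in that cleanup, here is a minimal Python sketch that flags the lines in one Fortran source that mention gfsio, separating `use` statements from other references. It is not the procedure used above: the function name `find_gfsio_refs` is hypothetical, the sample source is made up, and it assumes that a case-insensitive match on "gfsio" is a good enough filter.

```python
import re

# Assumption: any case-insensitive occurrence of "gfsio" is worth reviewing.
USE_RE = re.compile(r"^\s*use\s+(\w*gfsio\w*)", re.IGNORECASE)
ANY_RE = re.compile(r"gfsio", re.IGNORECASE)


def find_gfsio_refs(source):
    """Return (line_number, kind, text) for each non-comment line mentioning gfsio."""
    refs = []
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and "!" comment lines; fixed-form "C"/"*" comments
        # are not handled in this sketch.
        if not stripped or stripped.startswith("!"):
            continue
        if USE_RE.match(line):
            refs.append((number, "use", stripped))
        elif ANY_RE.search(line):
            refs.append((number, "other", stripped))
    return refs


# Made-up sample, not taken from the real gfspost.f.
sample = """\
      use gfsio_module
      implicit none
! gfsio is no longer read here
      call initpost_gfs(iunit)
      call gfsio_open(gfile, fname, 'read', iret)
"""

for number, kind, text in find_gfsio_refs(sample):
    print(f"{number}: [{kind}] {text}")
```

Note that the call to initpost_gfs is not flagged, because the name does not contain "gfsio". Wrapper routines like that still have to be found by hand or by extending the patterns.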

gfsio itself was an early attempt to standardize output formats and grids. Instead of spectral coefficients, state was stored on Gaussian grids, which were easier for post processors to deal with, and it was intended that any GFS code would present and use this format to the user community. It fell into disfavor because the files were substantially larger than GFS sigma files, and it was replaced by a step back to the sigio and sfcio formats and adaptation of post processors to handle these. Then the sfcio and sigio formats were replaced by the nemsio API, which persisted until 2020 and the replacement of nemsio with netcdf. Nemsio I/O was parallel and netcdf was not, but netcdf was more standard and had extensive compression support to reduce state footprint.

gfsio is dead at NCEP and is only needed because some codes have sections that compile with it but never branch to it. sigio, sfcio and nemsio remain extensively used. These libraries can be removed when GFS-based codes (currently ensembles and CFS) are removed from production. This is expected to be complete in about four years. The libraries will have to be retained for the research community until archives with that format are no longer used. The lifetime of that data is probably 15 years.

GeorgeVandenberghe-NOAA commented 4 years ago

It should be noted that parallel compression support was (rapidly and successfully) developed for netcdf in response to critical performance requirements of NCEP codes.

edwardhartnett commented 4 years ago

@GeorgeVandenberghe-NOAA my conclusion reading your comment is that you were successful in your prototype code of getting post to build and test (if it tests) OK without gfsio? So we can mark gfsio for deprecation?

GeorgeVandenberghe-NOAA commented 4 years ago

Not yet. We either have to do this for every GFS based code, or we have to add it to the source trees of those codes and modify their builds. I am leaning more towards the second approach. "Modification" for the GFS model means adding the gfsio object to the makefile and satisfying the externals from the resultant .o and .mod files and making sure these build before the things in the source that need them. It is not at all hard since GFS based things used a flat directory structure for their source. NEMS gfs had a more complicated build and it may be easier to build the library as a new dependency in their make script.

Our ensembles still depend on the GFS dynamic core and use GFS builds.


--

George W Vandenberghe

IMSG at NOAA/NWS/NCEP/EMC

5830 University Research Ct., Rm. 2141

College Park, MD 20740

George.Vandenberghe@noaa.gov

301-683-3769(work) 3017751547(cell)

edwardhartnett commented 4 years ago

Well the goal is to reduce complexity. If we are adding the code to multiple projects, that sounds like we are increasing complexity. If this code is used in multiple places, and will be needed in the future, let's just keep it in this library and add tests.

GeorgeVandenberghe-NOAA commented 4 years ago

The key is, it is dead code. The sections of the GFS and NCEP_POST and chgres that use gfsio are never actually executed. So we just need the API to be available because it is too tedious to clean it out of the codes that use it. Because the library is not executed and the sections of code that use it are themselves never modified, maintenance is a non-issue and our cost is just tracking and rebuilding it on new platforms from time to time.

The sigio and sfcio APIs are still used by GFS things and those sections of code are executed. Nemsio also has a sigio dependency (don't ask why😡 but it drove the only sigio bugfix we had to do in fifteen years!)

Because of that nemsio dependency, we are stuck with sigio for the foreseeable future. I think gfsio is a unique, truly dead case we can get out of library space and into the builds of applications which are themselves to be sunset within five years.


edwardhartnett commented 4 years ago

One person's tedium is another person's daily work. ;-) So we should not be preserving the API at all, we should be removing this completely. I'm happy to jump in and help.

Can you list here every place it is used? Or create an issue in NCEP_POST and GFS to remove this code?
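One way to start such a list is to walk a checkout and record every source or build file that mentions gfsio. The following is a sketch only: `list_gfsio_uses` is a hypothetical helper, the file suffixes and build-file names are assumptions about what the trees contain, and the demo files are made up.

```python
import os
import re
import tempfile

GFSIO_RE = re.compile(r"gfsio", re.IGNORECASE)
# Assumption: these suffixes and names cover the Fortran sources and build files.
SOURCE_SUFFIXES = (".f", ".f90", ".F", ".F90", ".mk", ".cmake")
BUILD_NAMES = {"makefile", "cmakelists.txt"}


def list_gfsio_uses(root):
    """Map each relative file path to the line numbers that mention gfsio."""
    hits = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not (name.endswith(SOURCE_SUFFIXES) or name.lower() in BUILD_NAMES):
                continue
            path = os.path.join(dirpath, name)
            with open(path, errors="replace") as handle:
                lines = [
                    number
                    for number, line in enumerate(handle, start=1)
                    if GFSIO_RE.search(line)
                ]
            if lines:
                hits[os.path.relpath(path, root)] = lines
    return hits


if __name__ == "__main__":
    # Made-up miniature tree, only to show the shape of the report.
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "gfspost.f"), "w") as handle:
            handle.write("      use gfsio_module\n      call initpost_gfs\n")
        with open(os.path.join(root, "notes.txt"), "w") as handle:
            handle.write("gfsio mentioned in a file type that is skipped\n")
        for path, lines in sorted(list_gfsio_uses(root).items()):
            print(path, lines)
```

Pointing `list_gfsio_uses` at a checkout of each code would give a per-file list that could be pasted into an issue. It only matches the literal string, so routines that wrap gfsio under other names would still need a manual pass.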