NOAA-EMC / GSI

Gridpoint Statistical Interpolation
GNU Lesser General Public License v3.0

Upgrade to BUFR 12 #642

Open DavidHuber-NOAA opened 1 year ago

DavidHuber-NOAA commented 1 year ago

A new major version of BUFR (version 12) is available and will be the default version in spack-stack. Although version 11.7.0 can be installed on top of existing stacks, an upgrade to version 12 should be pursued. BUFR 12 installs only the bufr_4 library, so switching to it simply requires updating src/gsi/CMakeLists.txt. Additionally, the ufbqcd subroutine, which read_prepbufr.f90 calls in two places, takes an integer array for the virtual temperature flag (vtcd) in BUFR 12, as opposed to a floating-point array in previous versions, so the corresponding variables in read_prepbufr.f90 must be updated.

Some work was performed in #589 to test BUFR 12, which showed some slowdown with version 12.0.0 (a 20-40 s, or ~5-10%, increase for the global_3dvar and global_4denvar test cases). This should be investigated to determine whether the time difference is acceptable and, if it is not, we should work with @jbathegit and @jack-woollen to see if the library can be optimized.

jack-woollen commented 1 year ago

@DavidHuber-NOAA I want to compile the gsi using bufr/12.0.0. There's a module for that in /apps/ops/para/libs/intel/19.1.3.304/bufr. The cmake for gsi on wcoss2 uses /apps/ops/prod/libs/intel/19.1.3.304/bufr. How can I tell it to use the para tree?

DavidHuber-NOAA commented 1 year ago

@jack-woollen The GSI looks for the following environment variables for BUFR. I would suggest adding these to gsi_common.lua in place of the load(...bufr...) command:

prepend_path("PATH", "</path/to/bufr12>/bin", ":")
prepend_path("LD_LIBRARY_PATH", "</path/to/bufr12>/lib", ":")
prepend_path("DYLD_LIBRARY_PATH", "</path/to/bufr12>/lib", ":")
prepend_path("LD_LIBRARY_PATH", "</path/to/bufr12>/lib64", ":")
prepend_path("DYLD_LIBRARY_PATH", "</path/to/bufr12>/lib64", ":")
prepend_path("CPATH", "</path/to/bufr12>/include", ":")
prepend_path("CMAKE_PREFIX_PATH", "</path/to/bufr12>/.", ":")
setenv("BUFR_LIB4", "</path/to/bufr12>/lib64/libbufr_4.so")

You will also need to change a few lines within the source code and one of the CMake files. Feel free to copy what I have in:

/scratch1/NCEPDEV/nems/David.Huber/GSI/gsi_spackstack_b12_n492/src/gsi/read_prepbufr.f90 -- look for vtcd and glcd
/scratch1/NCEPDEV/nems/David.Huber/GSI/gsi_spackstack_b12_n492/src/gsi/CMakeLists.txt -- change bufr_d to bufr_4

jack-woollen commented 1 year ago

Thanks @DavidHuber-NOAA - works well. Good to know.

DavidHuber-NOAA commented 1 year ago

@jack-woollen I've moved this conversation over to this issue dealing with BUFR 12.

@RussTreadon-NOAA @aerorahul I copied /lfs/h2/emc/global/noscrub/Jack.Woollen/bufrtime/bufr_v12.0.0/NCEPLIBS-bufr over to Hera:/scratch1/NCEPDEV/nems/David.Huber/LIBS/BUFR/bufr-bufr_v12.0.0_fast and built it with Intel 2022 and installed it here: /scratch1/NCEPDEV/nems/David.Huber/LIBS/BUFR/bufr/12.0.0_fast. Next, I compiled the GSI with the spack-stack/1.4.1 libraries with the exception of bufr, where v12.0.0_fast was used instead (located here: /scratch1/NCEPDEV/nems/David.Huber/GSI/gsi_spackstack_b12_fast). I then built a copy of the GSI with spack-stack/1.4.1, including bufr/11.7.0 (located here: /scratch1/NCEPDEV/nems/David.Huber/GSI/gsi_spackstack). Finally, I ran regression tests between the two cases. All tests passed.

The regression tests have been updated and global_3dvar is no longer included in the test suite. Two new tests have been added for HAFS, both of which are impacted by the BUFR slowdown: hafs_3denvar_hybens and hafs_4denvar_glbens. I also included the rrfs_3denvar_glbens test, which also runs the observer. The results of the regression tests show improvements in the global_4denvar test case, especially with a higher number of PEs (hiproc). Averaging the differences across all cases shows a 4.4 s increase for the hiproc tests and a 22.6 s increase for the loproc tests.

My opinion is that, since this seems to affect only the observer and thus won't scale up with more iterations, this is an acceptable increase in runtime. Thoughts, Russ, et al.?

Runtimes

| Test | 11.7.0 loproc time (s) | 12.0.0_fast loproc time (s) | 11.7.0 hiproc time (s) | 12.0.0_fast hiproc time (s) |
|---|---|---|---|---|
| global_4denvar | 381.6 | 409.6 | 325.3 | 318.0 |
| hafs_3denvar_hybens | 318.2 | 330.3 | 235.3 | 252.1 |
| hafs_4denvar_glbens | 362.6 | 382.6 | 266.0 | 275.7 |
| rrfs_3denvar_glbens | 79.5 | 109.7 | 59.1 | 57.8 |

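
For anyone who wants to re-derive the averages quoted above, here is a minimal sketch that recomputes the mean loproc and hiproc differences from the tabulated runtimes. The dictionary layout and the helper name mean_difference are only illustrative and are not part of the GSI regression scripts. With the rounded values in the table, the loproc mean comes out at about 22.6 s and the hiproc mean at about 4.5 s; the small gap from the quoted 4.4 s may simply reflect rounding in the tabulated values.

```python
# Wall times in seconds, copied from the "Runtimes" table above.
# Tuple order: (11.7.0 loproc, 12.0.0_fast loproc, 11.7.0 hiproc, 12.0.0_fast hiproc)
runtimes = {
    "global_4denvar":      (381.6, 409.6, 325.3, 318.0),
    "hafs_3denvar_hybens": (318.2, 330.3, 235.3, 252.1),
    "hafs_4denvar_glbens": (362.6, 382.6, 266.0, 275.7),
    "rrfs_3denvar_glbens": (79.5, 109.7, 59.1, 57.8),
}


def mean_difference(table, old_col, new_col):
    """Mean of (new - old) across all tests, in seconds (hypothetical helper)."""
    diffs = [row[new_col] - row[old_col] for row in table.values()]
    return sum(diffs) / len(diffs)


loproc_mean = mean_difference(runtimes, 0, 1)
hiproc_mean = mean_difference(runtimes, 2, 3)
print(f"loproc mean increase: {loproc_mean:.2f} s")
print(f"hiproc mean increase: {hiproc_mean:.2f} s")
```
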
RussTreadon-NOAA commented 1 year ago

The wall time increases with bufr/12.0.0_fast are not trivial, especially for rrfs_3denvar_glbens. The hafs hiproc runs also show increased run time when using bufr/12.0.0_fast. Adding @hu5970 and @ShunLiu-NOAA for awareness.

gsi.x is not the only application built with bufr modules. How do other bufr dependent applications perform when moving to bufr/12.0.0?

jack-woollen commented 1 year ago

I agree with Russ about the timing. There is more to it than we have found yet. Another half to find in fact. I have a few ideas to try next.

jack-woollen commented 1 year ago

@DavidHuber-NOAA @jbathegit Well, I found about a half of the half of the difference left with an update in rdcmps. With this change added, the difference from the gsi observer control is reduced to +12-13s. Same comparison for running prepobs shows a difference of +7-8s. Maybe we're getting down to manageable territory. The updated code is in the same place on dogwood in /lfs/h2/emc/global/noscrub/Jack.Woollen/bufrtime/bufr_v12.0.0.

DavidHuber-NOAA commented 1 year ago

@jack-woollen I ran the regression tests between spack-stack/1.5.1 (bufr/11.7.0) and spack-stack/1.5.1 with the bufr 12 library installed here: /scratch1/NCEPDEV/global/Jack.Woollen/bufrtime/bufr_v12.0.0/build/path1. The global_4denvar timings have improved and in fact are a little faster than 11.7.0. The HAFS timings improved significantly for the 3denvar/hiproc case, but stayed about the same for all other cases. And the RRFS timings stayed about the same. This is definitely progress, though.

Runtimes

| Test | 11.7.0 loproc time (s) | 12.0.0_fast loproc time (s) | 11.7.0 hiproc time (s) | 12.0.0_fast hiproc time (s) |
|---|---|---|---|---|
| global_4denvar | 374.5 | 371.9 | 296.3 | 294.3 |
| hafs_3denvar_hybens | 289.0 | 308.0 | 237.8 | 213.6 |
| hafs_4denvar_glbens | 351.6 | 358.2 | 261.3 | 277.5 |
| rrfs_3denvar_glbens | 77.7 | 107.0 | 55.4 | 54.4 |

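
To make the per-test picture easier to read, the following sketch turns the runtimes above into absolute and relative changes for each test. It is only an illustration: the helper name slowdown and the print layout are assumptions, not something taken from the GSI regression suite.

```python
# Wall times in seconds, copied from the second "Runtimes" table above.
# Tuple order: (11.7.0 loproc, 12.0.0_fast loproc, 11.7.0 hiproc, 12.0.0_fast hiproc)
runtimes = {
    "global_4denvar":      (374.5, 371.9, 296.3, 294.3),
    "hafs_3denvar_hybens": (289.0, 308.0, 237.8, 213.6),
    "hafs_4denvar_glbens": (351.6, 358.2, 261.3, 277.5),
    "rrfs_3denvar_glbens": (77.7, 107.0, 55.4, 54.4),
}


def slowdown(old, new):
    """Return (change in seconds, change in percent) of new relative to old."""
    delta = new - old
    return delta, 100.0 * delta / old


for test, (lo_old, lo_new, hi_old, hi_new) in runtimes.items():
    lo_s, lo_pct = slowdown(lo_old, lo_new)
    hi_s, hi_pct = slowdown(hi_old, hi_new)
    print(f"{test:22s} loproc {lo_s:+6.1f} s ({lo_pct:+5.1f}%)  "
          f"hiproc {hi_s:+6.1f} s ({hi_pct:+5.1f}%)")
```

Negative values mean the BUFR 12 build was faster. For the rows above, global_4denvar comes out slightly negative at both PE counts, while rrfs_3denvar_glbens at loproc is roughly 29 s (about 38%) slower.
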
DavidHuber-NOAA commented 1 year ago

@jack-woollen It's become apparent that the HAFS tests have a lot of variability in runtimes, so perhaps we should not include them in the timing tests. I am going to run the global_4denvar and rrfs_3denvar_glbens tests with bufr/11.7.0 and 12.0.0_fast a few times and compare mean runtimes at low/high PE counts.
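
A comparison of mean runtimes over repeated runs could be sketched as follows. The function names are hypothetical and no measured values are included; the inputs would be the wall times (in seconds) collected from each repeat of a given test and PE count.

```python
from statistics import mean, stdev


def summarize_runs(times_s):
    """Return (mean, sample standard deviation) of repeated wall times in seconds."""
    if len(times_s) < 2:
        raise ValueError("need at least two runs to estimate variability")
    return mean(times_s), stdev(times_s)


def mean_slowdown(old_runs_s, new_runs_s):
    """Difference of mean wall times (new - old) in seconds; positive means slower."""
    return mean(new_runs_s) - mean(old_runs_s)
```

Comparing the result of mean_slowdown against the standard deviations from summarize_runs gives a rough sense of whether a difference stands out from run-to-run noise.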

DavidHuber-NOAA commented 12 months ago

@jack-woollen I (finally) ran the tests I mentioned above which revealed that global_4denvar when run with your BUFR optimizations has nearly the same runtimes as version 11.7.0, which is great!

The rrfs_3denvar_glbens test is still showing a slowdown with bufr/12, however. Interestingly, there is a lot of variation in the RRFS runtimes, which suggests a bug in the RRFS DA (it reminds me of an MPI bug within the RRFS code found during the Intel 2022 upgrade).

The runtimes for the tests are attached. BUFR_Runtimes.xlsx

jbathegit commented 12 months ago

> @DavidHuber-NOAA @jbathegit Well, I found about a half of the half of the difference left with an update in rdcmps. With this change added, the difference from the gsi observer control is reduced to +12-13s. Same comparison for running prepobs shows a difference of +7-8s. Maybe we're getting down to manageable territory. The updated code is in the same place on dogwood in /lfs/h2/emc/global/noscrub/Jack.Woollen/bufrtime/bufr_v12.0.0.

@jack-woollen I see your update to rdcmps(). Thanks for coming up with that fix; I can work to pull it over to the develop baseline. But is there anything else you want to include as well, or anything else you're still working on with @DavidHuber-NOAA that's related to these timing problems?

Sorry if I missed something, but there's been a lot of traffic and discussion on this thread and I'm just trying to understand any net changes that we need to bring over now to the library baseline. Note that I've already pulled over and merged your previous upb8() fix.

jack-woollen commented 12 months ago

@jbathegit There is one more older change that didn't get into 12.0.0 which is in test/test_ufbrw.F90. It is in the working set on hera in /scratch1/NCEPDEV/global/Jack.Woollen/bufrtime/bufr_v12.0.0/NCEPLIBS-bufr, with the other changes. Thanks.

jbathegit commented 11 months ago

Thanks @jack-woollen but is there any way you could copy that test/test_ufbrw.F90 change over to somewhere on dogwood (or cactus or acorn)? I don't have an account on hera.

jack-woollen commented 11 months ago

/lfs/h2/emc/global/noscrub/Jack.Woollen/bufrtime/bufr_v12.0.0/NCEPLIBS-bufr

RussTreadon-NOAA commented 11 months ago

Any updates on this issue?

DavidHuber-NOAA commented 11 months ago

It looks like there is a PR to make the optimizations to the BUFR library (NOAA-EMC/NCEPLIBS-bufr#543). Once merged and a new tag is created, we can request that version be installed in a future spack-stack release. And then we can go about implementing this in the GSI. I would guess this would be near the beginning of the second quarter of next year.

RussTreadon-NOAA commented 11 months ago

Thank you @DavidHuber-NOAA for the update. It's good to see that we may be able to move forward with this issue in the coming year.

CatherineThomas-NOAA commented 7 months ago

@DavidHuber-NOAA I see that https://github.com/NOAA-EMC/NCEPLIBS-bufr/pull/543 has been merged but I don't see a new tag yet. Do you know of any plans for a new tagged version?

DavidHuber-NOAA commented 7 months ago

@jbathegit @AlexanderRichert-NOAA Do you know when a new tagged version for BUFR is expected?

jbathegit commented 7 months ago

My plan is to release a new version 12.1.0 in late May or early June.

DavidHuber-NOAA commented 7 months ago

Thanks @jbathegit!

RussTreadon-NOAA commented 6 months ago

@DavidHuber-NOAA : what is the status of this issue? Are we waiting for bufr/12.1.0?

DavidHuber-NOAA commented 6 months ago

@RussTreadon-NOAA Yes, we are waiting on that version. When it is released, I will test it then ask for it to be installed into spack-stack 1.6.0. This will hopefully be done next month.

RussTreadon-NOAA commented 6 months ago

Thank you @DavidHuber-NOAA for the update. Good to hear that this issue is still on track. We just need to wait for bufr/12.1.0.

@CatherineThomas-NOAA : I will add this issue to the GFS v17 milestone for tracking purposes.

DavidHuber-NOAA commented 4 months ago

BUFR 12.1.0 was released yesterday. A request to include it in the next version of spack-stack and have it installed on WCOSS2 is open: https://github.com/JCSDA/spack-stack/issues/1194.

RussTreadon-NOAA commented 2 months ago

@DavidHuber-NOAA , what is the status of this issue?

DavidHuber-NOAA commented 2 months ago

BUFR 12.1.0 is being rolled out with spack-stack 1.8.0. The release candidate has been installed on Hera and official installations will be rolled out soon. I will be performing the upgrade over the next 4 weeks (hopefully sooner).

RussTreadon-NOAA commented 2 months ago

Excellent! Thank you @DavidHuber-NOAA for your diligence in upgrading GSI to bufr/12.