JCSDA / spack

A flexible package manager that supports multiple versions, configurations, platforms, and compilers.
https://spack.io

CRTM version and coefficient file information #131

Closed BenjaminTJohnson closed 2 years ago

BenjaminTJohnson commented 2 years ago

Summary

This issue is intended to help guide the addition and use of the CRTM releases from JCSDA.

There are 4 primary "releases" of the CRTM that require support. The primary difference between _emc and _jedi is in the build system and binary file handling: _emc requires only cmake, whereas _jedi* requires ecbuild, and the _emc releases use git-LFS to obtain the coefficient files directly when the repository is cloned. I also maintain an FTP server that has tarballs (duplicates of the git-LFS crtm/fix/ structure).

Here are the branch names, available from https://github.com/JCSDA/crtm.git

release/REL-2.3.0_emc
release/REL-2.4.0_emc

release/crtm_jedi               (v2.3.0 code base, with some modifications, details in Description section) 
release/crtm_jedi_v2.4.0

Note: We chose to use release/ branches (rather than relying on tags alone) to enable post-release support updates to the build system / data sources, or to fix coefficient files, etc.

Therefore, selecting a specific hash in spack is probably not ideal -- but rather one should be checking out the "tip" of the release branch of interest. I recognize this "release branch" approach complicates the git history, but it greatly simplifies support. Open to recommendations.
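As a rough illustration of following a branch tip rather than a pinned hash, the four branches can be kept in a small lookup table that builds the matching `git clone` command. The `CRTM_RELEASES` table, its keys, and the `clone_command` helper are hypothetical names for this sketch and are not part of Spack or CRTM; only the branch names, build tools, and repository URL come from this issue.

```python
# Hypothetical lookup table: branch names and build tools are taken from
# this issue; the dictionary keys and helper name are illustrative only.
CRTM_REPO = "https://github.com/JCSDA/crtm.git"

CRTM_RELEASES = {
    "2.3.0_emc": {"branch": "release/REL-2.3.0_emc", "build": "cmake"},
    "2.4.0_emc": {"branch": "release/REL-2.4.0_emc", "build": "cmake"},
    "2.3.0_jedi": {"branch": "release/crtm_jedi", "build": "ecbuild"},
    "2.4.0_jedi": {"branch": "release/crtm_jedi_v2.4.0", "build": "ecbuild"},
}


def clone_command(release, dest="crtm"):
    """Return a git command that checks out the tip of a release branch."""
    branch = CRTM_RELEASES[release]["branch"]
    # --branch follows the branch tip, so post-release fixes are picked up.
    return ["git", "clone", "--depth", "1", "--branch", branch, CRTM_REPO, dest]


if __name__ == "__main__":
    print(" ".join(clone_command("2.4.0_emc")))
```

For the _emc branches, git-LFS would need to be installed for the clone to pull the coefficient files as well.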

There are also some differences between the coefficient packages (i.e., the 'fix/' directory that contains all of the little endian and big endian files, along with netcdf counterparts where available). The _jedi requirements are more lightweight than _emc. JEDI/UFO wanted a minimal set of coefficients -- little endian only, and a subset of instruments. The EMC fix/ is the entirety of the big_endian, little_endian, and netcdf files.

Rationale

No response

Description

Here are the details of the differences between the 4 "releases" that we're interested in supporting in spack:

REL-2.3.0_emc (left) vs. crtm_jedi (right), filenames that are different:

                              > /autogen.sh
                              > /ChangeLog.txt
                              > /CI/buildspec_clang.yml
                              > /CI/cdash-url.sh
                              > /CI/CMakeLists.txt
                              > /cmake/cdash-integration.cmake
                              > /cmake/compiler_flags_Cray_Fortran.cmake
                              > /cmake/compiler_flags_GNU_Fortran.cmake
                              > /cmake/compiler_flags_Intel_Fortran.cmake
                              > /cmake/compiler_flags_XL_Fortran.cmake
                              > /cmake/crtm_compiler_flags.cmake
                              > /cmake/CTestCustom.ctest.in
/cmake/PackageConfig.cmake.in <
                              > /config_bld_test.crtm
/COPYING                      <
                              > /configure
                              > /configure.ac
                              > /crtm-config.cmake.in
                              > /crtm-import.cmake.in
/DIFFLOG.md                   <
                              > /CTestConfig.cmake
                              > /install-sh
                              > /libsrc/CRTM_Module.fpp
                              > /libsrc/make.dependencies
                              > /libsrc/Makefile.in
                              > /libsrc/make.filelist
                              > /libsrc/make.rules
                              > /libsrc/RSS_Emissivity_Model.f90
/LICENSE.md                   | /LICENSE
/README.md                    < /README
                              > /Makefile.in
/VERSION                      | /VERSION.cmake
                              > /NOTES

These differences are primarily related to build system support, with some legacy support for autotools build.

crtm_jedi also contains an entire test/ directory that was not present in the REL-2.3.0_emc release. REL-2.3.0_emc contains the fix/ directory (obtained via git-LFS on checkout), which crtm_jedi does not.
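A filename listing of this kind can be reproduced with a set comparison over two checkouts. This is only a minimal sketch: `relative_files` and `compare_listings` are hypothetical helper names, and unlike the listing above it reports files present on one side only, without pairing renamed files such as /LICENSE.md and /LICENSE.

```python
import os


def relative_files(root):
    """Collect file paths under root, relative to root, with '/' separators."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.add(rel.replace(os.sep, "/"))
    return found


def compare_listings(left, right):
    """Return (only_left, only_right) as sorted lists of relative paths."""
    left, right = set(left), set(right)
    return sorted(left - right), sorted(right - left)
```

A typical call would be `compare_listings(relative_files("REL-2.3.0_emc"), relative_files("crtm_jedi"))`, where the two directory names are assumed to be local checkouts of the respective branches.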

Comparing crtm_jedi_v2.4.0 and REL-2.4.0_emc is more challenging. We released crtm_jedi_v2.4.0 in the same directory structure that we use for development, rather than as a "build release" like REL-2.3.0_emc, REL-2.4.0_emc, and crtm_jedi. In the future, all releases of CRTM will use the same structure as crtm_jedi_v2.4.0. Here's the 2-level tree for crtm_jedi_v2.4.0:

.
├── cmake
│   └── Modules
├── configuration
├── scripts
│   └── shell
├── src
│   ├── Ancillary
│   ├── AntennaCorrection
│   ├── AtmAbsorption
│   ├── AtmOptics
│   ├── AtmScatter
│   ├── Atmosphere
│   ├── Build
│   ├── CRTM_Utility
│   ├── ChannelInfo
│   ├── Coefficients
│   ├── GeometryInfo
│   ├── InstrumentInfo
│   ├── Interpolation
│   ├── NLTE
│   ├── Options
│   ├── RTSolution
│   ├── SensorInfo
│   ├── SfcOptics
│   ├── Source_Functions
│   ├── Statistics
│   ├── Surface
│   ├── TauProd
│   ├── TauRegress
│   ├── Test_Utility
│   ├── User_Code
│   ├── Utility
│   └── Zeeman
├── test
│   ├── cmake
│   ├── mains
│   ├── test_build
│   └── testinput
└── util
    └── checkendian

where the source codes are distributed under these various src/* directories.

vs. the tree of REL-2.4.0_emc

.
├── cmake
├── fix
│   ├── ACCoeff
│   ├── AerosolCoeff
│   ├── CloudCoeff
│   ├── EmisCoeff
│   ├── SpcCoeff
│   └── TauCoeff
├── libsrc
└── test
    ├── cmake
    ├── mains
    ├── test_build
    └── testinput

where all source codes are flat in libsrc.

Now, to the code differences themselves: REL-2.3.0_emc vs. crtm_jedi has lots of code differences; however, most of them relate to (a) documentation material that was removed for crtm_jedi and (b) internal module ID information that was added by SVN (and removed in crtm_jedi).

There are, however, some critical code differences relating to an OpenMP implementation in crtm_jedi, but numerically the results are identical to at least single precision (1e-6). There are also many documentation differences between the codes. I have an extensive "diff" and can provide it if necessary.
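One way to check a claim like "identical to within 1e-6" is a relative-difference comparison over paired outputs from the two builds. The sketch below is an assumption about how such a check might look, not the comparison actually used for CRTM; both function names are hypothetical.

```python
def max_relative_difference(reference, candidate):
    """Largest relative difference between paired values (0.0 if all match)."""
    if len(reference) != len(candidate):
        raise ValueError("inputs must have the same length")
    worst = 0.0
    for ref, new in zip(reference, candidate):
        scale = max(abs(ref), abs(new))
        # Pairs that are both exactly zero contribute no difference.
        if scale > 0.0:
            worst = max(worst, abs(ref - new) / scale)
    return worst


def agree_to_single_precision(reference, candidate, rel_tol=1e-6):
    """True when every pair agrees to within rel_tol (relative)."""
    return max_relative_difference(reference, candidate) <= rel_tol
```

Here `reference` and `candidate` would be, for example, brightness temperatures produced by the two code bases for the same inputs.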

The code differences between REL-2.4.0_emc and crtm_jedi_v2.4.0 are minimal. The primary difference is this addition to Reflection_Correction_Module.f90 in REL-2.4.0_emc to deal with small transmittance values causing an underflow in certain situations:

diff -b `find ./REL-2.4.0_emc_public/ -name "Reflection_Correction_Module.f90" -type f -print -quit` `find ./CRTM_v2.4.0_for_jedi/ -name "Reflection_Correction_Module.f90" -type f -print -quit`
33d32
<   USE CRTM_Parameters, ONLY: LIMIT_EXP   
105c104
<     Transmittance_in, &  ! Input
---
>     Transmittance, &  ! Input
114c113
<     REAL(fp)              , INTENT(IN)     :: Transmittance_in
---
>     REAL(fp)              , INTENT(IN)     :: Transmittance
121d119
<     REAL(fp) :: Transmittance
123,125d120
<     ! Apply limits per Emily Liu (5/2022) -- this only applies to the REL_2.4.0_emc release branch, and has not 
<     ! been merged into v2.4.1 or later pending a more detailed assessment --BTJ
<     Transmittance = MAX(Transmittance_in, EXP(-LIMIT_EXP)) 
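The effect of the added line can be illustrated outside Fortran: the input transmittance is floored at EXP(-LIMIT_EXP), so a vanishingly small value can no longer underflow in later arithmetic. The value of `LIMIT_EXP` below is an assumed placeholder, since the actual constant is defined in CRTM_Parameters and is not quoted in this issue; `limit_transmittance` is a hypothetical name.

```python
import math

# Assumed placeholder: the actual LIMIT_EXP constant lives in CRTM_Parameters
# and is not quoted in this issue; 20.0 is used here only for illustration.
LIMIT_EXP = 20.0


def limit_transmittance(transmittance_in, limit_exp=LIMIT_EXP):
    """Mirror MAX(Transmittance_in, EXP(-LIMIT_EXP)) from the diff above."""
    return max(transmittance_in, math.exp(-limit_exp))
```

With this placeholder, ordinary values such as 0.5 pass through unchanged, while 0.0 or 1e-300 is replaced by exp(-20), roughly 2.06e-9.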

Additional information

No response

General information

kgerheiser commented 2 years ago

Just saw this. Thanks for the info. I'm still going through it all, but some initial comments.

We have a crtm package here: https://github.com/NOAA-EMC/spack/blob/d125b1f748c06f714316184ef1e629d394215729/var/spack/repos/builtin/packages/crtm/package.py

And a crtm-fix package that pulls from FTP here: https://github.com/NOAA-EMC/spack/blob/jcsda_emc_spack_stack/var/spack/repos/jcsda-emc/packages/crtm-fix/package.py

crtm has a variant +fix that decides whether to install crtm-fix.

It was easier to split it up as a separate package with its own hash and version, but perhaps they should be combined to just use the git-lfs binaries. The git-lfs quota was exceeded once, which wasn't a problem with FTP.

I had a problem building the JEDI 2.4.0 release with Ecbuild a while back, but maybe that has been fixed?

https://github.com/NOAA-EMC/spack/blob/d125b1f748c06f714316184ef1e629d394215729/var/spack/repos/builtin/packages/crtm/package.py#L28-L30

"Therefore, selecting a specific hash in spack is probably not ideal"

What we have now specifies the hash that was at the tip of the branch when I created the package. We could set it to do it by branch, but would it be possible to offer point releases based off of that branch? I think it would be clearer that way.

We also have this workaround in crtm-fix that goes back to earlier this year. I don't really get what the issue was, but the big endian amsua_metop-c.SpcCoeff.bin was mixed up with the little endian version or vice-versa, so we perform some hack before installing.

https://github.com/NOAA-EMC/spack/blob/d125b1f748c06f714316184ef1e629d394215729/var/spack/repos/jcsda-emc/packages/crtm-fix/package.py#L52-L69
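A mix-up like this could in principle be caught before installing by inspecting the file's byte order. The sketch below assumes the coefficient files are Fortran sequential unformatted files whose records are framed by 4-byte length markers (a common compiler default, but an assumption here); it is not the logic of the crtm-fix workaround or of CRTM's own checkendian utility, and `guess_record_byte_order` is a hypothetical name.

```python
import struct


def guess_record_byte_order(data):
    """Guess the byte order of Fortran sequential unformatted data.

    Assumes each record is framed by identical 4-byte length markers.
    Returns "little", "big", or "unknown".
    """
    if len(data) < 8:
        return "unknown"
    matches = []
    for name, fmt in (("little", "<i"), ("big", ">i")):
        (length,) = struct.unpack(fmt, data[:4])
        end = 4 + length
        # A plausible first record fits in the data and is followed by a
        # trailing marker that repeats the same length.
        if length >= 0 and end + 4 <= len(data):
            (trailer,) = struct.unpack(fmt, data[end:end + 4])
            if trailer == length:
                matches.append(name)
    # Only report a result when exactly one interpretation is consistent.
    return matches[0] if len(matches) == 1 else "unknown"
```

The function takes raw bytes, so a caller would pass the contents of a file opened in binary mode; comparing the result against the directory the file sits in would flag a file stored under the wrong endianness.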

BenjaminTJohnson commented 2 years ago

@kgerheiser see #155 for updated tags and coordination with Dom.