travissluka / soca-tutorial

temporary testbed for a soca tutorial, before it gets absorbed into the correct JEDI repos later
Apache License 2.0

Running SOCA on local machines & other unsupported HPCs #1

Closed gmao-cda closed 7 months ago

gmao-cda commented 8 months ago

Hi Travis,

JoJo (12 cores+64GB memory) finished updating.

I'm following this tutorial (https://spack-stack.readthedocs.io/en/1.6.0/NewSiteConfigs.html#prerequisites-ubuntu-one-off) to build spack and the unified-dev template on my local machine (OS: Kubuntu 22.04 LTS, 6.5.0-21-generic #21~22.04.1-Ubuntu, gcc@11.4.0).

Problem

The package list in the installation prerequisites appears to be incomplete: Section 6.2.2, "Prerequisites: Ubuntu (one-off)" (https://spack-stack.readthedocs.io/en/1.6.0/NewSiteConfigs.html#prerequisites-ubuntu-one-off), Step 1 "Install basic OS packages as root", subsection "Misc".

I got two errors:

Problem related to autopoint:

Can't exec "autopoint": No such file or directory at /usr/share/autoconf/Autom4te/FileUtils.pm.

Problem related to gettext:

/bin/bash: line 1: xgettext: command not found
make[2]: *** [Makefile:674: cxpm.po] Error 127
make[2]: *** Waiting for unfinished jobs....
mv -f .deps/cxpm.Tpo .deps/cxpm.Po
make[2]: Leaving directory '/home/cda/fast/pkg/spack-stack/cache/build_stage/spack-stage-libxpm-3.5.17-psnzjhv7rkydtqscil7kgk3hsu27i4j2/spack-src/cxpm'
make[1]: *** [Makefile:491: all-recursive] Error 1
make[1]: Leaving directory '/home/cda/fast/pkg/spack-stack/cache/build_stage/spack-stage-libxpm-3.5.17-psnzjhv7rkydtqscil7kgk3hsu27i4j2/spack-src'
make: *** [Makefile:400: all] Error 2

Solution

I needed to install two additional packages:

sudo apt-get install autopoint
sudo apt-get install gettext
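
Before restarting a long Spack build, it can help to confirm that the commands named in the two errors above are on PATH. The following is a minimal sketch, not part of the spack-stack instructions; the helper name missing_tools is made up for illustration.

```shell
# Print each named command that cannot be found on PATH, one per line.
# missing_tools is a hypothetical helper name, not a spack-stack utility.
missing_tools() {
    for mt_tool in "$@"; do
        # command -v exits non-zero when the command does not exist
        command -v "$mt_tool" >/dev/null 2>&1 || printf '%s\n' "$mt_tool"
    done
}

# The two commands whose absence caused the errors above.
# On Ubuntu they come from the autopoint and gettext packages.
missing_tools autopoint xgettext
```

If the last line prints nothing, both commands were found.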

Consequence

The build now gets past these previously failing steps, though the installation hasn't finished yet. I will update when it finishes.

Comment

  1. I will probably try to disable some of the components listed in spack.yaml. There are many components that I don't need.
  2. I will gradually turn this thread into a wiki-like page so that users on local machines & unsupported HPCs can follow it to build their SOCA test cases.

gmao-cda commented 8 months ago

Another installation error related to crtm@2.4.0.1 (Solved)

I believe this crtm error is unrelated to my machine and is instead caused by the package being built. The error output is:

[+] /home/cda/fast/pkg/spack-stack/envs/unified-env.jojoSOCA/install/gcc/11.4.0/ncio-1.1.2-pvmcajg
==> Installing crtm-2.4.0.1-fg456hia2nbwycn6wkgq6kjvk7hf7sok [265/399]
==> No binary for crtm-2.4.0.1-fg456hia2nbwycn6wkgq6kjvk7hf7sok found: installing from source
==> Warning: Fetching from mirror without a checksum!
  This package is normally checked out from a version control system, but it has been archived on a spack mirror.  This means we cannot know a checksum for the tarball in advance. Be sure that your connection to this mirror is secure!

gzip: stdin: invalid compressed data--crc error

gzip: stdin: invalid compressed data--length error
/usr/bin/tar: Unexpected EOF in archive
/usr/bin/tar: Unexpected EOF in archive
/usr/bin/tar: Error is not recoverable: exiting now
==> Using cached archive: /home/cda/fast/pkg/spack-stack/cache/source_cache/_source-cache/git//JCSDA/crtm.git/7ecad4866c400d7d0db1413348ee225cfa99ff36.tar.gz
==> Error: ProcessError: Command exited with status 2:
    '/usr/bin/tar' '-oxf' '/home/cda/fast/pkg/spack-stack/cache/build_stage/spack-stage-crtm-2.4.0.1-fg456hia2nbwycn6wkgq6kjvk7hf7sok/7ecad4866c400d7d0db1413348ee225cfa99ff36.tar.gz'
==> Warning: Skipping build of gsi-env-1.0.0-372rrkrny3wsxhvgrup7u3k7zd43oghj since crtm-2.4.0.1-fg456hia2nbwycn6wkgq6kjvk7hf7sok failed
==> Warning: Skipping build of upp-10.0.10-2sj4tx2v3cpocqqrtkgfz3taoave6yjh since crtm-2.4.0.1-fg456hia2nbwycn6wkgq6kjvk7hf7sok failed
==> Warning: Skipping build of ufs-srw-app-env-1.0.0-5hvlqzo4c5ckntcupnk5gl33m5zx5w7h since upp-10.0.10-2sj4tx2v3cpocqqrtkgfz3taoave6yjh failed
==> Warning: Skipping build of global-workflow-env-1.0.0-babe3raht6rts7hpd235rkak5cvnjbkt since crtm-2.4.0.1-fg456hia2nbwycn6wkgq6kjvk7hf7sok failed
==> Warning: Skipping build of ufs-weather-model-env-1.0.0-xyjxfabdvbiz3cj4vdcbzxxxzjghmsal since crtm-2.4.0.1-fg456hia2nbwycn6wkgq6kjvk7hf7sok failed
==> Error: Terminating after first install failure: ProcessError: Command exited with status 2:
    '/usr/bin/tar' '-oxf' '/home/cda/fast/pkg/spack-stack/cache/build_stage/spack-stage-crtm-2.4.0.1-fg456hia2nbwycn6wkgq6kjvk7hf7sok/7ecad4866c400d7d0db1413348ee225cfa99ff36.tar.gz'

The key error message (last line) shows that tar fails on the file /home/cda/fast/pkg/spack-stack/cache/build_stage/spack-stage-crtm-2.4.0.1-fg456hia2nbwycn6wkgq6kjvk7hf7sok/7ecad4866c400d7d0db1413348ee225cfa99ff36.tar.gz

Debug info

I manually reran the extraction with tar's -v option to see what happens:

cda@jojo:~/Desktop$ tar -xvzf 7ecad4866c400d7d0db1413348ee225cfa99ff36.tar.gz
crtm/
crtm/.gitattributes
crtm/.gitignore
crtm/CMakeLists.txt
crtm/COPYING
crtm/LICENSE.md
crtm/NOTES.md
crtm/README.md
crtm/VERSION
crtm/crtm_release_notes.txt
crtm/.github/
crtm/.github/main.yaml
crtm/cmake/
crtm/cmake/CTestCustom.ctest.in
crtm/cmake/FindNetCDF.cmake
crtm/cmake/PackageConfig.cmake.in
crtm/cmake/compiler_flags_GNU_Fortran.cmake
crtm/cmake/compiler_flags_Intel_Fortran.cmake
crtm/cmake/crtm_compiler_flags.cmake
crtm/fix/

(...Cheng Da's comment: skip showing all here...)

crtm/fix/TauCoeff/ODPS/Little_Endian/iasi616_metop-c.TauCoeff.bin
crtm/fix/TauCoeff/ODPS/Little_Endian/iasiB1_metop-a.TauCoeff.bin

gzip: stdin: invalid compressed data--crc error

gzip: stdin: invalid compressed data--length error
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now

It seems that the archive /home/cda/fast/pkg/spack-stack/cache/build_stage/spack-stage-crtm-2.4.0.1-fg456hia2nbwycn6wkgq6kjvk7hf7sok/7ecad4866c400d7d0db1413348ee225cfa99ff36.tar.gz itself is corrupt.
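
The gzip errors above can be reproduced without unpacking anything. Below is a minimal sketch that tests an archive's integrity; archive_ok is a hypothetical helper name, and the idea that removing a corrupt cached copy lets Spack fetch it again is an assumption, not something confirmed in this thread.

```shell
# Return 0 if the gzip-compressed archive passes its integrity check.
# archive_ok is a hypothetical helper name used only for illustration.
archive_ok() {
    # gzip -t decompresses to nowhere and verifies the stored CRC and length
    gzip -t "$1" 2>/dev/null
}

# Hypothetical usage (cached_tarball is a placeholder variable):
#   archive_ok "$cached_tarball" || echo "corrupt: $cached_tarball"
```

A failing check on the copy under source_cache would suggest that the download itself was truncated, rather than the staging step.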

Traceback & Possible reasons for this failure

Required by following packages

repos/spack-stack/packages/ufs-weather-model-env/package.py:    depends_on("crtm@2.4.0.1", type="run")
repos/spack-stack/packages/ufs-srw-app-env/package.py:    depends_on("crtm@2.4.0.1")
repos/spack-stack/packages/global-workflow-env/package.py:    depends_on("crtm@2.4.0.1")
repos/spack-stack/packages/jedi-base-env/package.py:    # depends_on("crtm@v2.4.1-jedi", type="run")
repos/spack-stack/packages/gmao-swell-env/package.py:    depends_on("crtm@v2.4-jedi.2", type="run")
repos/spack-stack/packages/gsi-env/package.py:    depends_on("crtm@2.4.0.1")

Checking repos/builtin/packages/crtm/package.py shows that crtm@2.4.0.1 corresponds to version("2.4.0.1", tag="v2.4.0_emc.3", commit="7ecad4866c400d7d0db1413348ee225cfa99ff36"). Let me see what happens if I download it directly from GitHub.

Update 1:

Update 2: temporary fix by excluding the apps that depend on crtm@2.4.0.1. I created a customized template, soca-dev, with the following spack.yaml:

# spack-stack hash: 06de425
# spack hash: 03ecae6b77
spack:
  concretizer:
    unify: when_possible

  view: false
  include:
  - site
  - common

  definitions:
  - compilers: ['%gcc']
  - packages:
      #- ewok-env +ecflow +cylc
      #- geos-gcm-env
      #- global-workflow-env
      #- gmao-swell-env
      #- gsi-env
      #- jedi-fv3-env
      #- jedi-geos-env
      #- jedi-mpas-env
      #- jedi-neptune-env
    - jedi-tools-env
      #- jedi-ufs-env
      #- jedi-um-env
      #- nceplibs-env
    - soca-env
      #- ufs-srw-app-env
      #- ufs-utils-env
      #- ufs-weather-model-env
      #- upp-env
      #- ww3-env

      # Various fms tags (list all to avoid duplicate packages)
    - fms@release-jcsda
    - fms@2023.04

      # Various crtm tags (list all to avoid duplicate packages)
      #- crtm@2.4.0.1
      #- crtm@v2.4.1-jedi

      # MADIS for WCOSS2 decoders.
    - madis@4.5

  specs:
  - matrix:
    - [$packages]
    - [$compilers]
    exclude:
        # jedi-tools doesn't build with Intel
    - jedi-tools-env%intel
  packages:
    all:
      compiler: [gcc@11.4.0]
      providers:
        mpi: [mpich@4.1.1]
    fontconfig:
      variants: +pic
    pixman:
      variants: +pic
    cairo:
      variants: +pic

Now spack finishes the installation. I hope SOCA does not use radiances, so that we don't need CRTM. Let me see if this simplified environment works for your tutorial.

gmao-cda commented 8 months ago

SOCA build & ctest

Finished building the SOCA executables, as confirmed by checking the ../bin dir:

cda@jojo:~/fast/work/soca-tutorial/jedi-bundle/build/soca$ ls ../bin/
check_ioda_nc.py                 oops_plot                           saber_doc_overview.sh      soca_enspert.x                   soca_setcorscales.x
compare-odbs                     oops_plot.py                        saber_plot                 soca_ensrecenter.x               soca_sqrtvertloc.x
compare.py                       oops_test_wrapper.sh                soca_addincrement.x        soca_error_covariance_toolbox.x  soca_var.x
cpplint.py                       plot                                soca_checkpoint_model.x    soca_forecast.x                  test_wrapper.sh
ioda_compare_odc_with_netcdf.py  plot.py                             soca_convertincrement.x    soca_gridgen.x                   ufo_cpplint.py
ioda_compare.sh                  refactor-yaml.py                    soca_convertstate.x        soca_hofx3d.x                    vader_cpplint.py
ioda_cpplint.py                  retrieval_upgrader.py               soca_diffstates.x          soca_hofx.x
oops_compare.py                  saber_compare_dirac_diagnostics.py  soca_enshofx.x             soca_hybridgain.x
oops_cpplint.py                  saber_cpplint.py                    soca_ensmeanandvariance.x  soca_letkf.x

However, I got 73 failed tests when running ctest inside build/soca:

The following tests FAILED:
      2 - test_soca_gridgen (Failed)
      3 - test_soca_geometry (Failed)
      4 - test_soca_geometry_iterator_2d (Failed)
      5 - test_soca_geometry_iterator_3d (Failed)
      6 - test_soca_state (Failed)
      7 - test_soca_increment (Failed)
      8 - test_soca_model (Failed)
      9 - test_soca_modelaux (Failed)
     10 - test_soca_getvalues (Failed)
     11 - test_soca_errorcovariance (Failed)
     12 - test_soca_linearmodel (Failed)
     13 - test_soca_varchange_ana2model (Failed)
     14 - test_soca_varchange_balance (Failed)
     15 - test_soca_varchange_balance_TSSSH (Failed)
     16 - test_soca_varchange_bkgerrfilt (Failed)
     17 - test_soca_varchange_horizfilt (Failed)
     18 - test_soca_varchange_bkgerrsoca (Failed)
     19 - test_soca_varchange_bkgerrsoca_stddev (Failed)
     20 - test_soca_varchange_bkgerrgodas (Failed)
     21 - test_soca_varchange_vertconv (Failed)
     22 - test_soca_obslocalization (Failed)
     23 - test_soca_obslocalization_vertical (Failed)
     24 - test_soca_obslocalizations (Failed)
     25 - test_soca_forecast_identity (Failed)
     26 - test_soca_forecast_mom6 (Failed)
     27 - test_soca_forecast_pseudo (Failed)
     28 - test_soca_forecast_mom6_ens1 (Failed)
     29 - test_soca_forecast_mom6_ens2 (Failed)
     30 - test_soca_forecast_mom6_ens3 (Failed)
     31 - test_soca_static_socaerror_init (Failed)
     32 - test_soca_static_socaerrorlowres_init (Failed)
     33 - test_soca_setcorscales (Failed)
     35 - test_soca_parameters_bump_cor_nicas_scales (Failed)
     36 - test_soca_parameters_bump_loc (Failed)
     38 - test_soca_parameters_diffusion_hz (Failed)
     39 - test_soca_parameters_diffusion_vt (Failed)
     40 - test_soca_enspert (Failed)
     41 - test_soca_convertstate (Failed)
     42 - test_soca_convertstate_changevar (Failed)
     43 - test_soca_ensmeanandvariance (Failed)
     44 - test_soca_parametric_stddev (Failed)
     45 - test_soca_ensrecenter (Failed)
     46 - test_soca_hybridgain (Failed)
     47 - test_soca_sqrtvertloc (Failed)
     48 - test_soca_diffstates (Failed)
     49 - test_soca_dirac_soca_cor_nicas_scales (Failed)
     50 - test_soca_dirac_soca_cov (Failed)
     51 - test_soca_dirac_socahyb_cov (Failed)
     52 - test_soca_dirac_horizfilt (Failed)
     53 - test_soca_dirac_soca_mask (Failed)
     54 - test_soca_dirac_soca_nomask (Failed)
     55 - test_soca_dirac_diffusion (Failed)
     56 - test_soca_makeobs (Failed)
     57 - test_soca_hofx_3d (Failed)
     58 - test_soca_hofx_4d (Failed)
     59 - test_soca_hofx_4d_pseudo (Failed)
     60 - test_soca_3dvar_soca (Failed)
     61 - test_soca_3dvarbump (Failed)
     62 - test_soca_3dvar_diffusion (Failed)
     63 - test_soca_3dvar_godas (Failed)
     64 - test_soca_3dvarlowres_soca (Failed)
     65 - test_soca_3dvarfgat (Failed)
     66 - test_soca_3dvarfgat_pseudo (Failed)
     67 - test_soca_3dhyb (Failed)
     68 - test_soca_3dhybfgat (Failed)
     69 - test_soca_4denvar (Failed)
     70 - test_soca_4dhybenvar (Failed)
     71 - test_soca_letkf (Failed)
     72 - test_soca_letkf_split_observer (Failed)
     73 - test_soca_letkf_split_solver (Failed)
     74 - test_soca_addincrement (Failed)
     75 - test_soca_convertincrement (Failed)
     76 - test_soca_checkpointmodel (Failed)

Will try to debug tomorrow
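
With this many failures it can be easier to work from a saved log than from the terminal. The following is a minimal sketch that counts the entries in a "The following tests FAILED" list; count_failed is a hypothetical helper name and ctest.log is a placeholder file name.

```shell
# Read ctest output on stdin and print how many tests are listed as failed.
# count_failed is a hypothetical helper name used only for illustration.
count_failed() {
    # Summary lines look like "      2 - test_soca_gridgen (Failed)"
    grep -c '^[[:space:]]*[0-9][0-9]* - .* (Failed)'
}

# Hypothetical usage, assuming the output was saved with: ctest > ctest.log 2>&1
#   count_failed < ctest.log
```

ctest can also rerun just the failing tests with their output shown, via `ctest --rerun-failed --output-on-failure`.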

travissluka commented 8 months ago

@gmao-cda Sorry for ignoring you, I didn't realize there were issues open until just now! (I apparently need to check my GitHub email notification settings.)

Regarding the building of spack-stack

  1. We haven't switched to 1.6 yet, and I believe 1.6 has not been finalized yet, so you're best off using the version listed in the tutorial here (1.5.1). That said, if you have it working, then carry on.
  2. Yes, modifying spack.yaml to disable all the stuff I know I don't need, including CRTM, is exactly what I do on my own machine.
  3. If you have issues with other packages that need to be installed for Ubuntu, please open an issue on the spack-stack repo (they're very responsive).

Did you ever get the ctests passing? Usually when every single test fails like that, it's because git-lfs was not set up correctly the first time, so the binary files were not downloaded. Try running ctest -R gridgen -V to see what it's complaining about.
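
One way to check the git-lfs theory: when LFS content has not been downloaded, the working tree holds small text pointer files instead of the real binaries, and their first line names the LFS spec. Here is a minimal sketch of that check; is_lfs_pointer is a hypothetical helper name, and which files to test is left to the reader.

```shell
# Return 0 if the file looks like a Git LFS pointer rather than real data.
# is_lfs_pointer is a hypothetical helper name used only for illustration.
is_lfs_pointer() {
    # Pointer files start with a line naming the LFS spec version
    head -c 200 "$1" 2>/dev/null |
        grep -q '^version https://git-lfs.github.com/spec/v1'
}

# Hypothetical usage (some_data_file is a placeholder variable):
#   is_lfs_pointer "$some_data_file" && echo "LFS content not downloaded"
```

If a data file turns out to be a pointer, running `git lfs install` followed by `git lfs pull` in that repository should fetch the real content.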

gmao-cda commented 8 months ago

Thank you for your guidance @travissluka !

  1. Yes. My SOCA works and I'm following your tutorial. Plan to do 3D-Var and other things this weekend.
  2. I will open an issue on the spack-stack repo
  3. I haven't rerun ctest yet. I will check git-lfs first (your comment on git-lfs makes sense, since my failed CRTM download also involved pulling large files) and rerun this weekend.

One more question, what is the difference of soca under

Thank you!

travissluka commented 8 months ago

@gmao-cda They are essentially the same. The repos on jcsda-internal are supposed to be private (soca is not, due to historical rebellious reasons). The repos on jcsda are automatically mirrored from the internal repo when there are updates. For consistency you should point to the jcsda repos.

gmao-cda commented 8 months ago

@travissluka I have rerun ctest. It turns out the failures are caused by a floating-point exception.

Error traceback

I'm using

After running ctest -R gridgen -V, I got:

(CDA: omitted)
2: &MPP_IO_NML
2:  HEADER_BUFFER_VAL=16384      ,
2:  GLOBAL_FIELD_ON_ROOT_PE=T,
2:  IO_CLOCKS_ON=F,
2:  SHUFFLE=0          ,
2:  DEFLATE_LEVEL=-1         ,
2:  CF_COMPLIANCE=F,
2:  /
2: NOTE from PE     0: MPP_IO_SET_STACK_SIZE: stack size set to     131072.
2:
2:
2: Floating point exception detected:
2: Floating point exception detected: Invalid operation
2:
2: Invalid operation
2:
2:  0# util::fpe_signal_action(int, siginfo_t*, void*) at /home/cda/fast/work/soca-tutorial/jedi-bundle/oops/src/oops/util/signal_trap.cc:228
2:  1# 0x0000147FA7442520 in /lib/x86_64-linux-gnu/libc.so.6
2:  2# H5T__init_native_float_types in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/hdf5-1.14.3-bp2q3oc/lib/libhdf5.so.310
2:  3# H5T_init in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/hdf5-1.14.3-bp2q3oc/lib/libhdf5.so.310
2:  4# H5VL_init_phase2 in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/hdf5-1.14.3-bp2q3oc/lib/libhdf5.so.310
2:  5# H5_init_library in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/hdf5-1.14.3-bp2q3oc/lib/libhdf5.so.310
2:  6# H5Eset_auto2 in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/hdf5-1.14.3-bp2q3oc/lib/libhdf5.so.310
2:  7# nc4_hdf5_initialize in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/netcdf-c-4.9.2-y7oy5nz/lib/libnetcdf.so.19
2:  8# NC_HDF5_initialize in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/netcdf-c-4.9.2-y7oy5nz/lib/libnetcdf.so.19
2:  9# nc_initialize in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/netcdf-c-4.9.2-y7oy5nz/lib/libnetcdf.so.19
2: 10# NC_open in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/netcdf-c-4.9.2-y7oy5nz/lib/libnetcdf.so.19
2: 11# nc__open in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/netcdf-c-4.9.2-y7oy5nz/lib/libnetcdf.so.19
2: 12# nf__open_ at /home/cda/fast/pkg/spack-stack/cache/build_stage/spack-stage-netcdf-fortran-4.6.1-ttr5vajwwlnm7b272c7namf2ci37n65u/spack-src/fortran/nf_control.F90:230
2: 13# __mpp_io_mod_MOD_mpp_open in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/fms-release-jcsda-77qcryz/lib/libfms.so
2: 14# __fms_io_mod_MOD_get_file_unit.constprop.0 in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/fms-release-jcsda-77qcryz/lib/libfms.so
2: 15# __fms_io_mod_MOD_field_exist in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/fms-release-jcsda-77qcryz/lib/libfms.so
2: 16# __fms_io_mod_MOD_fms_io_init.part.0 in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/fms-release-jcsda-77qcryz/lib/libfms.so
2: 17# __fms_mod_MOD_fms_init.part.0 in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/fms-release-jcsda-77qcryz/lib/libfms.so
2: 18# __soca_mom6_MOD_soca_geomdomain_init at /home/cda/fast/work/soca-tutorial/jedi-bundle/soca/src/soca/Model/mom6solo/soca_mom6.F90:95
2: 19# __soca_geom_mod_MOD_soca_geom_init at /home/cda/fast/work/soca-tutorial/jedi-bundle/soca/src/soca/Geometry/soca_geom_mod.F90:187
2: 20# soca_geo_setup_f90 at /home/cda/fast/work/soca-tutorial/jedi-bundle/soca/src/soca/Geometry/soca_geom.interface.F90:58
2: 21# soca::Geometry::Geometry(eckit::Configuration const&, eckit::mpi::Comm const&, bool) at /home/cda/fast/work/soca-tutorial/jedi-bundle/soca/src/soca/Geometry/Geometry.cc:37
2: 22# soca::GridGen::execute(eckit::Configuration const&, bool) const at /home/cda/fast/work/soca-tutorial/jedi-bundle/soca/src/soca/../mains/GridGen.h:36
2: 23# oops::Run::execute(oops::Application const&, eckit::mpi::Comm const&) at /home/cda/fast/work/soca-tutorial/jedi-bundle/oops/src/oops/runs/Run.cc:185
2: 24# main at /home/cda/fast/work/soca-tutorial/jedi-bundle/soca/src/mains/GridGen.cc:18
2: 25# __libc_start_call_main at ../sysdeps/nptl/libc_start_call_main.h:58
2: 26# __libc_start_main at ../csu/libc-start.c:379
2: 27# _start in /home/cda/fast/work/soca-tutorial/jedi-bundle/build/bin/soca_gridgen.x
2:
2:  0# util::fpe_signal_action(int, siginfo_t*, void*) at /home/cda/fast/work/soca-tutorial/jedi-bundle/oops/src/oops/util/signal_trap.cc:228
2:  1# 0x0000149CDEA42520 in /lib/x86_64-linux-gnu/libc.so.6
2:  2# H5T__init_native_float_types in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/hdf5-1.14.3-bp2q3oc/lib/libhdf5.so.310
2:  3# H5T_init in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/hdf5-1.14.3-bp2q3oc/lib/libhdf5.so.310
2:  4# H5VL_init_phase2 in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/hdf5-1.14.3-bp2q3oc/lib/libhdf5.so.310
2:  5# H5_init_library in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/hdf5-1.14.3-bp2q3oc/lib/libhdf5.so.310
2:  6# H5Eset_auto2 in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/hdf5-1.14.3-bp2q3oc/lib/libhdf5.so.310
2:  7# nc4_hdf5_initialize in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/netcdf-c-4.9.2-y7oy5nz/lib/libnetcdf.so.19
2:  8# NC_HDF5_initialize in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/netcdf-c-4.9.2-y7oy5nz/lib/libnetcdf.so.19
2:  9# nc_initialize in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/netcdf-c-4.9.2-y7oy5nz/lib/libnetcdf.so.19
2: 10# NC_open in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/netcdf-c-4.9.2-y7oy5nz/lib/libnetcdf.so.19
2: 11# nc__open in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/netcdf-c-4.9.2-y7oy5nz/lib/libnetcdf.so.19
2: 12# nf__open_ at /home/cda/fast/pkg/spack-stack/cache/build_stage/spack-stage-netcdf-fortran-4.6.1-ttr5vajwwlnm7b272c7namf2ci37n65u/spack-src/fortran/nf_control.F90:230
2: 13# __mpp_io_mod_MOD_mpp_open in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/fms-release-jcsda-77qcryz/lib/libfms.so
2: 14# __fms_io_mod_MOD_get_file_unit.constprop.0 in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/fms-release-jcsda-77qcryz/lib/libfms.so
2: 15# __fms_io_mod_MOD_field_exist in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/fms-release-jcsda-77qcryz/lib/libfms.so
2: 16# __fms_io_mod_MOD_fms_io_init.part.0 in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/fms-release-jcsda-77qcryz/lib/libfms.so
2: 17# __fms_mod_MOD_fms_init.part.0 in /home/cda/fast/pkg/spack-stack/envs/soca-env.jojoSOCA/install/gcc/11.4.0/fms-release-jcsda-77qcryz/lib/libfms.so
2: 18# __soca_mom6_MOD_soca_geomdomain_init at /home/cda/fast/work/soca-tutorial/jedi-bundle/soca/src/soca/Model/mom6solo/soca_mom6.F90:95
2: 19# __soca_geom_mod_MOD_soca_geom_init at /home/cda/fast/work/soca-tutorial/jedi-bundle/soca/src/soca/Geometry/soca_geom_mod.F90:187
2: 20# soca_geo_setup_f90 at /home/cda/fast/work/soca-tutorial/jedi-bundle/soca/src/soca/Geometry/soca_geom.interface.F90:58
2: 21# soca::Geometry::Geometry(eckit::Configuration const&, eckit::mpi::Comm const&, bool) at /home/cda/fast/work/soca-tutorial/jedi-bundle/soca/src/soca/Geometry/Geometry.cc:37
2: 22# soca::GridGen::execute(eckit::Configuration const&, bool) const at /home/cda/fast/work/soca-tutorial/jedi-bundle/soca/src/soca/../mains/GridGen.h:36
2: 23# oops::Run::execute(oops::Application const&, eckit::mpi::Comm const&) at /home/cda/fast/work/soca-tutorial/jedi-bundle/oops/src/oops/runs/Run.cc:185
2: 24# main at /home/cda/fast/work/soca-tutorial/jedi-bundle/soca/src/mains/GridGen.cc:18
2: 25# __libc_start_call_main at ../sysdeps/nptl/libc_start_call_main.h:58
2: 26# __libc_start_main at ../csu/libc-start.c:379
2: 27# _start in /home/cda/fast/work/soca-tutorial/jedi-bundle/build/bin/soca_gridgen.x
2:
2: ABORT: Trapped a floating point exception
2:        in file '/home/cda/fast/work/soca-tutorial/jedi-bundle/oops/src/oops/util/signal_trap.cc', line 254
2: ABORT: Trapped a floating point exception
2:        in file '/home/cda/fast/work/soca-tutorial/jedi-bundle/oops/src/oops/util/signal_trap.cc', line 254
2: Abort(1) on node 1 (rank 1 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 1) - process 1
2:
2: Floating point exception detected: Invalid operation
2:

My solutions to pass gridgen test

Following your suggestion in https://github.com/JCSDA-internal/soca/issues/830, I added NOTRAPFPE to the gridgen test in soca/test/CMakeLists.txt. That test then passed.

My solution to pass all tests

I made set ( TRAPFPE_ENV "OOPS_TRAPFPE=0") unconditional in soca/test/CMakeLists.txt by commenting out the surrounding condition (I know it's ugly...):

  #if ( ARG_NOTRAPFPE AND NOT SOCA_TESTS_FORC_TRAPFPE)
    set ( TRAPFPE_ENV "OOPS_TRAPFPE=0")
  #else()
  #  set ( TRAPFPE_ENV "OOPS_TRAPFPE=1")
  #endif()

Then all tests pass:

cda@jojo:~/fast/work/soca-tutorial/jedi-bundle/build_NOTRAPFPE/soca$ ctest
Test project /home/cda/fast/work/soca-tutorial/jedi-bundle/build_NOTRAPFPE/soca
      Start  1: soca_coding_norms
 1/76 Test  #1: soca_coding_norms ............................   Passed    0.93 sec
      Start  2: test_soca_gridgen
 2/76 Test  #2: test_soca_gridgen ............................   Passed    0.23 sec
      Start  3: test_soca_geometry
 3/76 Test  #3: test_soca_geometry ...........................   Passed    0.11 sec
      Start  4: test_soca_geometry_iterator_2d
 4/76 Test  #4: test_soca_geometry_iterator_2d ...............   Passed    0.11 sec
      Start  5: test_soca_geometry_iterator_3d
 5/76 Test  #5: test_soca_geometry_iterator_3d ...............   Passed    0.23 sec
      Start  6: test_soca_state
 6/76 Test  #6: test_soca_state ..............................   Passed    0.55 sec
      Start  7: test_soca_increment
 7/76 Test  #7: test_soca_increment ..........................   Passed    0.17 sec
      Start  8: test_soca_model
 8/76 Test  #8: test_soca_model ..............................   Passed    0.50 sec
      Start  9: test_soca_modelaux
 9/76 Test  #9: test_soca_modelaux ...........................   Passed    0.10 sec
      Start 10: test_soca_getvalues
10/76 Test #10: test_soca_getvalues ..........................   Passed    0.11 sec
      Start 31: test_soca_static_socaerror_init
11/76 Test #31: test_soca_static_socaerror_init ..............   Passed    2.11 sec
      Start 11: test_soca_errorcovariance
12/76 Test #11: test_soca_errorcovariance ....................   Passed    0.22 sec
      Start 12: test_soca_linearmodel
13/76 Test #12: test_soca_linearmodel ........................   Passed    0.35 sec
      Start 13: test_soca_varchange_ana2model
14/76 Test #13: test_soca_varchange_ana2model ................   Passed    0.11 sec
      Start 14: test_soca_varchange_balance
15/76 Test #14: test_soca_varchange_balance ..................   Passed    0.16 sec
      Start 15: test_soca_varchange_balance_TSSSH
16/76 Test #15: test_soca_varchange_balance_TSSSH ............   Passed    0.21 sec
      Start 16: test_soca_varchange_bkgerrfilt
17/76 Test #16: test_soca_varchange_bkgerrfilt ...............   Passed    0.12 sec
      Start 17: test_soca_varchange_horizfilt
18/76 Test #17: test_soca_varchange_horizfilt ................   Passed    0.14 sec
      Start 18: test_soca_varchange_bkgerrsoca
19/76 Test #18: test_soca_varchange_bkgerrsoca ...............   Passed    0.14 sec
      Start 19: test_soca_varchange_bkgerrsoca_stddev
20/76 Test #19: test_soca_varchange_bkgerrsoca_stddev ........   Passed    0.15 sec
      Start 20: test_soca_varchange_bkgerrgodas
21/76 Test #20: test_soca_varchange_bkgerrgodas ..............   Passed    1.21 sec
      Start 21: test_soca_varchange_vertconv
22/76 Test #21: test_soca_varchange_vertconv .................   Passed    0.18 sec
      Start 22: test_soca_obslocalization
23/76 Test #22: test_soca_obslocalization ....................   Passed    0.13 sec
      Start 23: test_soca_obslocalization_vertical
24/76 Test #23: test_soca_obslocalization_vertical ...........   Passed    0.23 sec
      Start 24: test_soca_obslocalizations
25/76 Test #24: test_soca_obslocalizations ...................   Passed    0.13 sec
      Start 25: test_soca_forecast_identity
26/76 Test #25: test_soca_forecast_identity ..................   Passed    0.24 sec
      Start 26: test_soca_forecast_mom6
27/76 Test #26: test_soca_forecast_mom6 ......................   Passed    0.50 sec
      Start 27: test_soca_forecast_pseudo
28/76 Test #27: test_soca_forecast_pseudo ....................   Passed    0.14 sec
      Start 28: test_soca_forecast_mom6_ens1
29/76 Test #28: test_soca_forecast_mom6_ens1 .................   Passed    0.51 sec
      Start 29: test_soca_forecast_mom6_ens2
30/76 Test #29: test_soca_forecast_mom6_ens2 .................   Passed    0.51 sec
      Start 30: test_soca_forecast_mom6_ens3
31/76 Test #30: test_soca_forecast_mom6_ens3 .................   Passed    0.51 sec
      Start 32: test_soca_static_socaerrorlowres_init
32/76 Test #32: test_soca_static_socaerrorlowres_init ........   Passed    0.35 sec
      Start 33: test_soca_setcorscales
33/76 Test #33: test_soca_setcorscales .......................   Passed    0.11 sec
      Start 34: test_soca_parameters_bump_cor_nicas
34/76 Test #34: test_soca_parameters_bump_cor_nicas ..........   Passed    0.51 sec
      Start 35: test_soca_parameters_bump_cor_nicas_scales
35/76 Test #35: test_soca_parameters_bump_cor_nicas_scales ...   Passed    1.67 sec
      Start 36: test_soca_parameters_bump_loc
36/76 Test #36: test_soca_parameters_bump_loc ................   Passed    0.54 sec
      Start 37: test_soca_parameters_bump_cov
37/76 Test #37: test_soca_parameters_bump_cov ................   Passed    1.68 sec
      Start 38: test_soca_parameters_diffusion_hz
38/76 Test #38: test_soca_parameters_diffusion_hz ............   Passed    1.87 sec
      Start 39: test_soca_parameters_diffusion_vt
39/76 Test #39: test_soca_parameters_diffusion_vt ............   Passed    0.17 sec
      Start 40: test_soca_enspert
40/76 Test #40: test_soca_enspert ............................   Passed    1.27 sec
      Start 41: test_soca_convertstate
41/76 Test #41: test_soca_convertstate .......................   Passed    0.15 sec
      Start 42: test_soca_convertstate_changevar
42/76 Test #42: test_soca_convertstate_changevar .............   Passed    0.18 sec
      Start 43: test_soca_ensmeanandvariance
43/76 Test #43: test_soca_ensmeanandvariance .................   Passed    0.23 sec
      Start 44: test_soca_parametric_stddev
44/76 Test #44: test_soca_parametric_stddev ..................   Passed    0.67 sec
      Start 45: test_soca_ensrecenter
45/76 Test #45: test_soca_ensrecenter ........................   Passed    0.23 sec
      Start 46: test_soca_hybridgain
46/76 Test #46: test_soca_hybridgain .........................   Passed    0.28 sec
      Start 47: test_soca_sqrtvertloc
47/76 Test #47: test_soca_sqrtvertloc ........................   Passed    2.31 sec
      Start 48: test_soca_diffstates
48/76 Test #48: test_soca_diffstates .........................   Passed    0.14 sec
      Start 49: test_soca_dirac_soca_cor_nicas_scales
49/76 Test #49: test_soca_dirac_soca_cor_nicas_scales ........   Passed    0.19 sec
      Start 50: test_soca_dirac_soca_cov
50/76 Test #50: test_soca_dirac_soca_cov .....................   Passed    0.74 sec
      Start 51: test_soca_dirac_socahyb_cov
51/76 Test #51: test_soca_dirac_socahyb_cov ..................   Passed    1.64 sec
      Start 52: test_soca_dirac_horizfilt
52/76 Test #52: test_soca_dirac_horizfilt ....................   Passed    0.15 sec
      Start 53: test_soca_dirac_soca_mask
53/76 Test #53: test_soca_dirac_soca_mask ....................   Passed    1.43 sec
      Start 54: test_soca_dirac_soca_nomask
54/76 Test #54: test_soca_dirac_soca_nomask ..................   Passed    0.41 sec
      Start 55: test_soca_dirac_diffusion
55/76 Test #55: test_soca_dirac_diffusion ....................   Passed    0.32 sec
      Start 56: test_soca_makeobs
56/76 Test #56: test_soca_makeobs ............................   Passed    1.14 sec
      Start 57: test_soca_hofx_3d
57/76 Test #57: test_soca_hofx_3d ............................   Passed    1.06 sec
      Start 58: test_soca_hofx_4d
58/76 Test #58: test_soca_hofx_4d ............................   Passed    1.21 sec
      Start 59: test_soca_hofx_4d_pseudo
59/76 Test #59: test_soca_hofx_4d_pseudo .....................   Passed    0.98 sec
      Start 60: test_soca_3dvar_soca
60/76 Test #60: test_soca_3dvar_soca .........................   Passed    1.18 sec
      Start 61: test_soca_3dvarbump
61/76 Test #61: test_soca_3dvarbump ..........................   Passed    1.09 sec
      Start 62: test_soca_3dvar_diffusion
62/76 Test #62: test_soca_3dvar_diffusion ....................   Passed    2.07 sec
      Start 63: test_soca_3dvar_godas
63/76 Test #63: test_soca_3dvar_godas ........................   Passed    1.51 sec
      Start 64: test_soca_3dvarlowres_soca
64/76 Test #64: test_soca_3dvarlowres_soca ...................   Passed    1.24 sec
      Start 65: test_soca_3dvarfgat
65/76 Test #65: test_soca_3dvarfgat ..........................   Passed    2.06 sec
      Start 66: test_soca_3dvarfgat_pseudo
66/76 Test #66: test_soca_3dvarfgat_pseudo ...................   Passed    1.05 sec
      Start 67: test_soca_3dhyb
67/76 Test #67: test_soca_3dhyb ..............................   Passed    2.05 sec
      Start 68: test_soca_3dhybfgat
68/76 Test #68: test_soca_3dhybfgat ..........................   Passed    2.86 sec
      Start 69: test_soca_4denvar
69/76 Test #69: test_soca_4denvar ............................   Passed    0.63 sec
      Start 70: test_soca_4dhybenvar
70/76 Test #70: test_soca_4dhybenvar .........................   Passed    1.06 sec
      Start 71: test_soca_letkf
71/76 Test #71: test_soca_letkf ..............................   Passed    0.36 sec
      Start 72: test_soca_letkf_split_observer
72/76 Test #72: test_soca_letkf_split_observer ...............   Passed    0.23 sec
      Start 73: test_soca_letkf_split_solver
73/76 Test #73: test_soca_letkf_split_solver .................   Passed    0.26 sec
      Start 74: test_soca_addincrement
74/76 Test #74: test_soca_addincrement .......................   Passed    0.14 sec
      Start 75: test_soca_convertincrement
75/76 Test #75: test_soca_convertincrement ...................   Passed    0.14 sec
      Start 76: test_soca_checkpointmodel
76/76 Test #76: test_soca_checkpointmodel ....................   Passed    0.23 sec

100% tests passed, 0 tests failed out of 76

Label Time Summary:
executable    =   5.35 sec*proc (22 tests)
mpi           =  49.90 sec*proc (75 tests)
script        =  45.47 sec*proc (54 tests)
soca          =  50.82 sec*proc (76 tests)

Total Test time (real) =  50.85 sec

My question is: why does ecbuild .. -DARG_NOTRAPFPE=ON not set TRAPFPE_ENV to "OOPS_TRAPFPE=0"?