Open mnlevy1981 opened 5 years ago
It turns out that Travis is running an old version of gfortran (4.8.4) and I'm having trouble updating to something more recent while still supporting netCDF. This issue is reproducible on hobart:
$ cd $MARBL_ROOT/tests/driver_src
$ module load compiler/gnu/4.8.5
$ export PATH=/usr/local/netcdf-c-4.6.1-f-4.4.4-gcc-g++-gfortran-4.8.5/bin:$PATH
$ export LD_LIBRARY_PATH=/usr/local/netcdf-c-4.6.1-f-4.4.4-gcc-g++-gfortran-4.8.5/lib:$LD_LIBRARY_PATH
$ make USE_NETCDF=TRUE
$ cd ../regression_tests/compute_cols/
$ ../../driver_exe/marbl.exe -i ../../input_files/settings/marbl_with_o2_consumption_scalef.input -n test.nml
An easy way to see the problem is to look at the variable photoC_diat_zint in the history output. From the above set of commands:
$ ncdump -v photoC_diat_zint history_1inst.nc | tail -n 4
data:
photoC_diat_zint = 0, 1.0142560690187e-311, 0, Infinity, Infinity ;
}
Meanwhile, just running the test script with a newer gfortran gives sensible values:
$ ./compute_cols.py --compiler gnu # v8.1.0
$ ncdump -v photoC_diat_zint history_1inst.nc | tail -n 4
photoC_diat_zint = 0.0253156906580491, 0.0322866652495998, 0,
0.0054540503475565, 0.00624035209750951 ;
}
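A check like this could be automated by scanning the ncdump text for non-finite or subnormal values. The following is a minimal sketch, not part of MARBL: the helper names parse_ncdump_values and suspicious_indices are hypothetical, and it assumes the variable is printed as one comma-separated list terminated by ";" as in the output above.

```python
import math
import re
import sys


def parse_ncdump_values(text, var):
    """Return the values ncdump printed for `var` (None for fill values)."""
    # Match "var = v1, v2, ... ;" which may wrap over several lines.
    pattern = rf"^\s*{re.escape(var)}\s*=\s*(.*?);"
    match = re.search(pattern, text, re.DOTALL | re.MULTILINE)
    if match is None:
        raise ValueError(f"{var} not found in ncdump output")
    values = []
    for token in match.group(1).split(","):
        token = token.strip()
        # ncdump prints "_" for fill values; keep them but do not parse them.
        values.append(None if token == "_" else float(token))
    return values


def suspicious_indices(values):
    """Indices whose value is Inf, NaN, or a nonzero subnormal number."""
    bad = []
    for i, v in enumerate(values):
        if v is None:
            continue
        subnormal = v != 0.0 and abs(v) < sys.float_info.min
        if not math.isfinite(v) or subnormal:
            bad.append(i)
    return bad


# The two outputs quoted above (gfortran 4.8.5 and 8.1.0 respectively).
OLD_GNU = "data:\n photoC_diat_zint = 0, 1.0142560690187e-311, 0, Infinity, Infinity ;\n}\n"
NEW_GNU = (
    " photoC_diat_zint = 0.0253156906580491, 0.0322866652495998, 0,\n"
    "    0.0054540503475565, 0.00624035209750951 ;\n}\n"
)
```

Calling suspicious_indices(parse_ncdump_values(OLD_GNU, "photoC_diat_zint")) flags columns 1, 3 and 4 of the 4.8.5 output, while the 8.1.0 output yields an empty list.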
Although it's worth noting that NAG doesn't like something:
$ ./compute_cols.py
...
(run_exe): Running following command:
(run_exe): /home/mlevy/codes/MARBL/tests/driver_exe/marbl.exe -n test.nml -i ../../input_files/settings/marbl_with_o2_consumption_scalef.input
Beginning compute_cols test...
Runtime Error: *** Arithmetic exception: Floating invalid operation - aborting
(run_exe): ERROR in executable
So maybe this is actually a Fortran issue that is being masked by some compilers?
Per 4365a66, this was a problem with how NAG handled netCDF fill values and is unrelated to the gfortran 4.8 problems.
I think this is related to the gfortran bug Keith reported. We have an associate statement in comp_co2calc_coeffs() that looks like
associate( &
    k0   => co2calc_coeffs(:)%k0, &
    k1   => co2calc_coeffs(:)%k1, &
    k2   => co2calc_coeffs(:)%k2, &
    ff   => co2calc_coeffs(:)%ff, &
    kw   => co2calc_coeffs(:)%kw, &
    kb   => co2calc_coeffs(:)%kb, &
    ks   => co2calc_coeffs(:)%ks, &
    kf   => co2calc_coeffs(:)%kf, &
    k1p  => co2calc_coeffs(:)%k1p, &
    k2p  => co2calc_coeffs(:)%k2p, &
    k3p  => co2calc_coeffs(:)%k3p, &
    ksi  => co2calc_coeffs(:)%ksi, &
    bt   => co2calc_coeffs(:)%bt, &
    st   => co2calc_coeffs(:)%st, &
    ft   => co2calc_coeffs(:)%ft, &
    temp => co2calc_state_in(:)%temp, &
    salt => co2calc_state_in(:)%salt &
    )
and if I try to print out salt vs co2calc_state_in(:)%salt I get the following in NAG
salt associate 33.5247993469238281 33.8219146728515625 30.6386947631835938 36.8080024719238281 34.5447998046875000
salt in type 33.5247993469238281 33.8219146728515625 30.6386947631835938 36.8080024719238281 34.5447998046875000
but something totally different from old GNU
salt associate 33.524799346923828 2156.2094726562500 2347.8051757812500 1.7406085729598999 19.125383377075195
salt in type 33.524799346923828 33.821914672851562 30.638694763183594 36.808002471923828 34.544799804687500
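To make the discrepancy explicit, the two old-GNU lines can be compared element by element. This is only an illustrative sketch using the values printed above; the function name mismatches and the tolerance are assumptions, not anything from the MARBL test suite.

```python
import math


def mismatches(a, b, rel_tol=1e-12):
    """Return the indices at which two equal-length sequences disagree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [
        i
        for i, (x, y) in enumerate(zip(a, b))
        if not math.isclose(x, y, rel_tol=rel_tol)
    ]


# Values copied from the old GNU output above.
gnu_associate = [33.524799346923828, 2156.2094726562500, 2347.8051757812500,
                 1.7406085729598999, 19.125383377075195]
gnu_in_type = [33.524799346923828, 33.821914672851562, 30.638694763183594,
               36.808002471923828, 34.544799804687500]

# Values copied from the NAG output above (both lines are identical).
nag_values = [33.5247993469238281, 33.8219146728515625, 30.6386947631835938,
              36.8080024719238281, 34.5447998046875000]
```

With old GNU, mismatches(gnu_associate, gnu_in_type) reports every index except the first, whereas the two NAG lines agree everywhere.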
So
See this toy repo for details on both points.
I have opened #19 to reference the gfortran bug. As for .travis.yml, I haven't made any changes regarding how the results are reported (yet).
Yesterday I noticed that Travis tests weren't updating. I found a forum post and followed its recommendation of revoking rights and then reinstating them, and that seemed to fix things.
Can we split up run_all_tests.sh to make it a little easier to find output from tests that fail? I'm thinking something like
- run_all_infrastructure_tests.sh to make sure all the python code works
- run_build_tests.sh to make sure the build system works
- run_examples.sh to make sure the "tests" that are just "run the model in different configurations" sanity checks work
- run_unit_tests.sh to run the unit tests
- run_regression_tests.sh to run compute_cols (currently the only true regression test, or will be once the baseline is committed to the repo)
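If the scripts are split this way, a thin wrapper could still run every group and report which ones failed. The sketch below is one possible shape, not an existing MARBL script: the function names are hypothetical, and the script names in GROUPS are the ones proposed above, assumed to live in the current directory.

```python
import subprocess
import sys

# Proposed script names from the list above; paths are an assumption.
GROUPS = {
    "infrastructure": ["./run_all_infrastructure_tests.sh"],
    "build": ["./run_build_tests.sh"],
    "examples": ["./run_examples.sh"],
    "unit": ["./run_unit_tests.sh"],
    "regression": ["./run_regression_tests.sh"],
}


def run_groups(groups):
    """Run each command in turn and return {group name: exit code}."""
    results = {}
    for name, cmd in groups.items():
        # A banner per group makes the failing section easy to find in CI logs.
        print(f"=== {name} ===", flush=True)
        results[name] = subprocess.run(cmd).returncode
    return results


def failed_groups(results):
    """Return the names of groups whose command exited non-zero."""
    return [name for name, code in results.items() if code != 0]


def main():
    failed = failed_groups(run_groups(GROUPS))
    if failed:
        print("Failed groups: " + ", ".join(failed))
    sys.exit(1 if failed else 0)
```

Calling main() from the CI configuration would keep a single entry point while the per-group banners show where a failure came from.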