Open micaeljtoliveira opened 5 days ago
@micaeljtoliveira I was able to build your gcc_compilation_and_warnings
branch on Gadi with
./build.bash -C gnu --mpi -DCMAKE_Fortran_FLAGS="-ffree-line-length-none"
However the compiled executable crashes with segmentation fault (for both MPI and serial executables). This is probably outside the scope of this pull request but I will document this here for now.
For the serial case I ran the example serial configuration in the CABLE repository:
cd src/offline/
../../bin/cable cable.nml
which outputs:
THE NAME LIST IS cable.nml
Use spatially-specific soil properties; 360 150
0
0
When choosing spatially-specific soil properties,
snow-free albedo is also overwritten by this data set.
Reading CABLE_PFTPARM namelist...
Reading CABLE_SOILPARM namelist...
Total number of patches (countPatch): 1
iveg 1 2
patchfrac 1 1.00000000
Could neither find restart file ./
nor ./
Pre-loaded default initialisations are used.
Loaded some parameters from met input file: TumbaFluxnet.1.3_met.nc the rest are default values - check log file
Reading CABLE_PFTPARM namelist...
Reading CABLE_SOILPARM namelist...
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x7fc420b985af in ???
#1 0x47e4da in ???
#2 0x40d6f8 in ???
#3 0x40321c in ???
#4 0x7fc420b847e4 in ???
#5 0x40325d in ???
#6 0xffffffffffffffff in ???
Segmentation fault
I was able to trace the segmentation fault to the acc_out_var
using the DDT debugger (requires setting CABLE_GNU_Fortran_FLAGS_DEBUG
to -g -O0
and building in the debug configuration).
For the MPI case I ran the crujra_accessN96_1h config which crashes similarly to the serial case.
Building your branch with the Intel compiler and running the executable seems to work fine so it looks like this is specific to the gfortran build.
@mcuntz I recall I used your compiler flags when adding the GNU compiler. The debug flags currently contain -O
(equivalent to -O1
) which enables some optimisations and prevents me from using the DDT debugger. Is there a reason why you use -O
instead of -O0
in the debug flags for gfortran?
@SeanBryan51 -O is there simply for speed even in debug mode. I basically debug with print statements and do not use much the debuggers. No worry to change it.
However, I think that the compilation should work with the pretty pedantic flags. If it works with another compiler means that that compiler assumes something and it is not always what you'd assume if you had coded properly. So I would be 100% for ironing out the places where it is not compiling.
@SeanBryan51 I'm able to reproduce your error. I've compiled cable with the -fbacktrace
option (plus a few others), so I get a more useful error message:
At line 2908 of file /home/micael/workspaces/cable/git/src/offline/cable_input.F90
Fortran runtime warning: An array temporary was created
Loaded some parameters from met input file: TumbaFluxnet.1.3_met.nc the rest are default values - check log file
Reading CABLE_PFTPARM namelist...
Reading CABLE_SOILPARM namelist...
At line 2194 of file /home/micael/workspaces/cable/git/src/offline/cable_output.F90
Fortran runtime error: Array bound mismatch for dimension 2 of array 'acc_val' (6/1)
Error termination. Backtrace:
#0 0x4745bc in generate_out_write_acc_d2
at /home/micael/workspaces/cable/git/src/offline/cable_output.F90:2194
#1 0x477303 in __cable_output_module_MOD_write_output
at /home/micael/workspaces/cable/git/src/offline/cable_output.F90:1769
#2 0x7ab84b in cable_offline_driver
at /home/micael/workspaces/cable/git/src/offline/cable_driver.F90:942
#3 0x40379c in main
at /home/micael/workspaces/cable/git/src/offline/cable_driver.F90:62
Okay, I think I understand the problem, but I'm not sure how you want to fix it.
In the following call in cable_output.F90
:
CALL generate_out_write_acc(output%SoilMoist, ovid%SoilMoist, 'SoilMoist', out%SoilMoist, REAL(ssnow%wb, 4), ranges%SoilMoist, patchout%SoilMoistIce, out_settings)
output%SoilMoist
is set to false and the out%SoilMoist
and ssnow%wb
arguments are passed down to the elemental
function generate_out_write_acc_d2
. At that point, the compiler checks the bounds of those arguments, as it needs to be sure they are compatible, as the function needs to be called element-wise. But output%SoilMoist
is actually never allocated (because output%SoilMoist
is false) and the check fails.
There are basically two solutions:
generate_out_write_acc
when argument output_var
is false before calling acc_out_var
. I would recommend against option 1., as the array bounds check is actually very useful when debugging.
Actually, even with the checks disabled, it fails (as I should have realized from the segfault message obtained by @SeanBryan51).
This means one really cannot call an elemental function over an array that has not been allocated, which makes total sense. I would thus argue that in this case gfortran is actually doing the right thing and it's the Intel compiler that is misbehaving.
There's a few more problematic cases in the same module were several non-allocated arrays are being accessed. For example, here
CALL generate_out_write_acc(output%casa, ovid%PlantTurnover, 'PlantTurnover', out%PlantTurnover, &
REAL((SUM(casaflux%Cplant_turnover, 2))/86400.0/1.201E-5, 4), ranges%NEE, patchout%PlantTurnover, out_settings)
it might happen that casaflux%Cplant_turnover
is not allocated, causing a segfault.
I think the only sensible solution is to fence all calls to generate_out_write_acc
with an if statement. Note that this issue was introduced in #287. @abhaasgoyal FYI.
@micaeljtoliveira thanks for looking into it.
I think the only sensible solution is to fence all calls to generate_out_write_acc with an if statement.
This seems appropriate to me. I'm happy to let you decide if you want to fix this in this pull request.
CABLE
Thank you for submitting a pull request to the CABLE Project.
Description
This PR fixes compilation with gfortran. I also fixes several compiler warnings (mostly about unused variables) in the MPI master/driver modules.
Type of change
Please delete options that are not relevant.
Checklist
📚 Documentation preview 📚: https://cable--410.org.readthedocs.build/en/410/