Closed by uqngibbo 1 year ago
Test results with commit 8b5e6e24, using the example code in examples/eilmer/2D/flat-plate-turbulent-mabey/su2-steady-state-solver, compiled in debug mode and run with a single thread:
Execution speed (three trials):
$ time e4-nk-shared --job=mabey --verbosity=1 --max-cpus=1
With -L-export-dynamic -link-defaultlib-debug
real 0m38.405s, 0m38.386s, 0m38.046s
Without (reference commit 8b5e6e24)
real 0m36.413s, 0m37.048s, 0m37.270s
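To put a number on the difference, the three trials above can be averaged. This is only a rough sketch: the timings are copied from the runs above, the helper name `mean` is made up for illustration, and three trials is too few to say much about run-to-run noise.

```shell
# Wall-clock times in seconds, copied from the three trials above.
with_flags="38.405 38.386 38.046"
without_flags="36.413 37.048 37.270"

# Hypothetical helper: arithmetic mean of whitespace-separated numbers.
mean() {
    echo "$@" | awk '{ s = 0; for (i = 1; i <= NF; i++) s += $i; printf "%.3f", s / NF }'
}

m_with=$(mean "$with_flags")
m_without=$(mean "$without_flags")

# Relative slowdown of the build with the extra flags, in percent.
slowdown=$(awk -v a="$m_with" -v b="$m_without" 'BEGIN { printf "%.1f", 100 * (a - b) / b }')

echo "mean with flags:    ${m_with}s"
echo "mean without flags: ${m_without}s"
echo "slowdown:           ${slowdown}%"
```

On these numbers the penalty comes out at a few percent of the debug-mode run time.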
File sizes:
                     WITH    WITHOUT
e4mpi*               28M     17M
e4-nk-dist*          30M     19M
e4-nk-dist-real*     29M     17M
e4-nk-shared*        30M     18M
e4-nk-shared-real*   29M     17M
e4shared*            28M     17M
e4zmpi*              30M     18M
e4zshared*           30M     18M
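The size increase per executable can be worked out from the table. A small sketch, with the sizes (in MB) copied from the table above and the variable names `sizes` and `report` chosen just for this example:

```shell
# Columns: executable name, size WITH the flags, size WITHOUT (both in MB).
sizes="e4mpi 28 17
e4-nk-dist 30 19
e4-nk-dist-real 29 17
e4-nk-shared 30 18
e4-nk-shared-real 29 17
e4shared 28 17
e4zmpi 30 18
e4zshared 30 18"

# For each executable, print the absolute and relative growth.
report=$(printf '%s\n' "$sizes" |
    awk '{ printf "%-18s +%dM (%.0f%%)\n", $1, $2 - $3, 100 * ($2 - $3) / $3 }')

printf '%s\n' "$report"
```

Every executable grows by 11M or 12M, so the overhead looks roughly constant rather than proportional to the original size.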
Note that the above stack trace does not have line numbers/filenames in it. This is because of a bug in older ldc2 compilers:
Compiled with v1.24 and v1.28:
object.Error@(0): RIP
----------------
??:? @nogc void ufluidblock.UFluidBlock.convective_flux_phase2(bool, ulong, fvcell.FVCell[], fvinterface.FVInterface[], fvvertex.FVVertex[]) [0x562ca66d6ad4]
Compiled with v1.32:
object.Error@(0): RIP
----------------
ufluidblock.d:887 @nogc void ufluidblock.UFluidBlock.convective_flux_phase2(bool, ulong, fvcell.FVCell[], fvinterface.FVInterface[], fvvertex.FVVertex[]) [0x564391322942]
A workaround for older compilers is to invoke the executable by its full path:
$ /home/uqngibbo/programs/gdtk/bin/e4-nk-shared --job=mabey --verbosity=1 --max-cpus=1
object.Error@(0): RIP
----------------
/home/uqngibbo/source/gdtk.fresh/src/eilmer/ufluidblock.d:887 @nogc void ufluidblock.UFluidBlock.convective_flux_phase2(bool, ulong, fvcell.FVCell[], fvinterface.FVInterface[], fvvertex.FVVertex[]) [0x559d54b31ad4]
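To avoid typing the full path by hand, the shell can look it up. This is a sketch only: it assumes e4-nk-shared is on PATH as a regular executable (not an alias or shell function) and that PATH holds absolute directories; `resolve_exe` is a name invented for this example.

```shell
# Hypothetical helper: print the path the shell would run for a command name.
resolve_exe() {
    command -v "$1"
}

exe_name="e4-nk-shared"
exe=$(resolve_exe "$exe_name") || exe=""

if [ -n "$exe" ]; then
    # Invoke the solver through its resolved path, as in the workaround above.
    "$exe" --job=mabey --verbosity=1 --max-cpus=1
else
    echo "$exe_name not found on PATH" >&2
fi
```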
I think the performance penalty and executable sizes noted above are acceptable in debug mode, so I have added the new flags in commit 34851ea4.
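For anyone wanting the same behaviour in their own build script, the idea is roughly the following. This is not the change made in the commit, just an illustration: the function name `dflags_for`, the flavour name "fast" and the optimised flags are assumptions; only the three debug flags come from this thread.

```shell
# Hypothetical flavour switch: pick ldc2 flags based on the build flavour.
dflags_for() {
    case "$1" in
        # Debug symbols plus the two flags discussed in this issue.
        debug) echo "-g -L-export-dynamic -link-defaultlib-debug" ;;
        # Assumed optimised settings for any other flavour.
        *)     echo "-O2 -release" ;;
    esac
}

echo "debug: ldc2 $(dflags_for debug)"
echo "fast:  ldc2 $(dflags_for fast)"
```

Keeping the extra flags out of the optimised flavour means the size and speed costs measured above only apply to debug builds.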
By default, the LDC compiler doesn't include the symbols needed for a nice stack trace, even when compiling with -g.
Example:
This LDC issue (https://github.com/ldc-developers/ldc/issues/863) explains that adding -L-export-dynamic to the compile command produces the expected output, which it does:
The stated reason the option is off by default is that it increases the size of the executable, but I have found that it only changes the size of e4-nk-shared from 18M to 25M. If there's no performance penalty for doing so, I propose including -L-export-dynamic in the debug flavour of the code.