Issue #484 :
As I was improving the checker logic and make the error messages more meaningful, I realized there were many confusing and overlapping conditions, as well as code duplications both within files and across the m_checker.f files in pre_process, simulation, and post_process. This is an attempt to clean everything up by refactoring the code:
Exact common checks by pre_process, simulation, and post_process into src/common/m_checker_common.fpp
Group checks into subroutines
Rewrote all error messages so they are clear regarding what the problem is (including m_check_patches.fpp)
Improve check logic
Remove code duplication using fpp
So after extracting common checks by the three stages, simplifying logic, and refactoring:
1841 -> 1218 lines
and that's with adding many more error messages.
Issue #482 :
This would be a simple search and replace as the functions to compare floating point numbers are already in place. However, 107 changes across 17 files are required, so I think it's cleaner to do this in a separate PR.
Examples of changes
1. Make code more concise
OLD
if (fd_order /= dflt_int &
.and. &
fd_order /= 1 .and. fd_order /= 2 .and. fd_order /= 4) then
call s_mpi_abort('Unsupported choice for the value of '// &
'fd_order. Exiting ...')
end if
NEW
if (all(fd_order /= (/dflt_int, 1, 2, 4/))) then
call s_mpi_abort('fd_order must be 1, 2, or 4. Exiting ...')
end if
2. Reduce redundant logic: the (n == 0 .and. p /= 0) case is caught elsewhere
OLD
if (n == 0 .and. mom_wrt(2)) then
call s_mpi_abort('Unsupported choice of the combination of '// &
'values for n and mom_wrt(2). Exiting ...')
elseif (n == 0 .and. mom_wrt(3)) then
call s_mpi_abort('Unsupported cohice of the combination of '// &
'values for n and mom_wrt(3). Exiting ...')
elseif (p == 0 .and. mom_wrt(3)) then
call s_mpi_abort('Unsupported choice of the combination of '// &
'values for p and mom_wrt(3). Exiting ...')
...
NEW
if (n == 0 .and. mom_wrt(2)) then
call s_mpi_abort('mom_wrt(2) is not supported for n = 0. Exiting ...')
elseif (p == 0 .and. mom_wrt(3)) then
call s_mpi_abort('mom_wrt(3) is not supported for p = 0. Exiting ...')
end if
3. Abstraction using fpp
Note:
fixes (bc == 15) typo
the default value of bc_x%vb1, etc. are 0d0, which is kept untouched
OLD
! Moving Boundaries Checks: x boundaries
if (any((/bc_x%vb1, bc_x%vb2, bc_x%vb3/) /= 0d0)) then
if (bc_x%beg == 15) then
if (any((/bc_x%vb2, bc_x%vb3/) /= 0d0)) then
call s_mpi_abort("Unsupported combination of bc_x%beg and"// &
"bc_x%vb2 or bc_x%vb3. Exiting ...")
end if
elseif (bc_x%beg /= -16) then
call s_mpi_abort("Unsupported combination of bc_x%beg and"// &
"bc_x%vb1, bc_x%vb2, or bc_x%vb3. Exiting...")
end if
end if
if (any((/bc_x%ve1, bc_x%ve2, bc_x%ve3/) /= 0d0)) then
if (bc_x%end == 15) then
if (any((/bc_x%ve2, bc_x%ve3/) /= 0d0)) then
call s_mpi_abort("Unsupported combination of bc_x%end and"// &
"bc_x%ve2 or bc_x%ve3. Exiting ...")
end if
elseif (bc_x%end /= -16) then
call s_mpi_abort("Unsupported combination of bc_x%end and"// &
"bc_x%ve1, bc_x%ve2, or bc_x%ve3. Exiting...")
end if
end if
! Moving Boundaries Checks: y boundaries
if (any((/bc_y%vb1, bc_y%vb2, bc_y%vb3/) /= 0d0)) then
if (bc_y%beg == 15) then
if (any((/bc_y%vb1, bc_y%vb3/) /= 0d0)) then
call s_mpi_abort("Unsupported combination of bc_y%beg and"// &
"bc_y%vb1 or bc_y%vb3. Exiting ...")
end if
elseif (bc_y%beg /= -16) then
call s_mpi_abort("Unsupported combination of bc_y%beg and"// &
"bc_y%vb1, bc_y%vb2, or bc_y%vb3. Exiting...")
end if
end if
if (any((/bc_y%ve1, bc_y%ve2, bc_y%ve3/) /= 0d0)) then
if (bc_y%end == 15) then
if (any((/bc_y%ve1, bc_y%ve3/) /= 0d0)) then
call s_mpi_abort("Unsupported combination of bc_y%end and"// &
"bc_y%ve1 or bc_y%ve3. Exiting ...")
end if
elseif (bc_y%end /= -16) then
call s_mpi_abort("Unsupported combination of bc_y%end and"// &
"bc_y%ve1, bc_y%ve2, or bc_y%ve3. Exiting...")
end if
end if
! Moving Boundaries Checks: z boundaries
if (any((/bc_z%vb1, bc_z%vb2, bc_z%vb3/) /= 0d0)) then
if (bc_z%beg == 15) then
if (any((/bc_x%vb1, bc_x%vb2/) /= 0d0)) then
call s_mpi_abort("Unsupported combination of bc_z%beg and"// &
"bc_x%vb1 or bc_x%vb1. Exiting ...")
end if
elseif (bc_z%beg /= -16) then
call s_mpi_abort("Unsupported combination of bc_z%beg and"// &
"bc_z%vb1, bc_z%vb2, or bc_z%vb3. Exiting...")
end if
end if
if (any((/bc_z%ve1, bc_z%ve2, bc_z%ve3/) /= 0d0)) then
if (bc_z%end == 15) then
if (any((/bc_x%ve1, bc_x%ve2/) /= 0d0)) then
call s_mpi_abort("Unsupported combination of bc_z%end and"// &
"bc_z%ve2 or bc_z%ve3. Exiting ...")
end if
elseif (bc_z%end /= -16) then
call s_mpi_abort("Unsupported combination of bc_z%end and"// &
"bc_z%ve1, bc_z%ve2, or bc_z%ve3. Exiting...")
end if
end if
NEW
#:for X, VB2, VB3 in [('x', 'vb2', 'vb3'), ('y', 'vb3', 'vb1'), ('z', 'vb1', 'vb2')]
if (any((/bc_${X}$%vb1, bc_${X}$%vb2, bc_${X}$%vb3/) /= 0d0)) then
if (bc_${X}$%beg == -15) then
if (any((/bc_${X}$%${VB2}$, bc_${X}$%${VB3}$/) /= 0d0)) then
call s_mpi_abort("bc_${X}$%beg must be -15 if "// &
"bc_${X}$%${VB2}$ or bc_${X}$%${VB3}$ "// &
"is set. Exiting ...")
end if
elseif (bc_${X}$%beg /= -16) then
call s_mpi_abort("bc_${X}$%beg must be -15 or -16 if "// &
"bc_${X}$%vb[1,2,3] is set. Exiting ...")
end if
end if
#:endfor
#:for X, VE2, VE3 in [('x', 've2', 've3'), ('y', 've3', 've1'), ('z', 've1', 've2')]
if (any((/bc_${X}$%ve1, bc_${X}$%ve2, bc_${X}$%ve3/) /= 0d0)) then
if (bc_${X}$%end == -15) then
if (any((/bc_${X}$%${VE2}$, bc_${X}$%${VE3}$/) /= 0d0)) then
call s_mpi_abort("bc_${X}$%end must be -15 if "// &
"bc_${X}$%${VE2}$ or bc_${X}$%${VE3}$ "// &
"is set. Exiting ...")
end if
elseif (bc_${X}$%end /= -16) then
call s_mpi_abort("bc_${X}$%end must be -15 or -16 if "// &
"bc_${X}$%ve[1,2,3] is set. Exiting ...")
end if
end if
#:endfor
4. More robust comparison with dflt_real that works with arrays
OLD
if (any(patch_icpp(patch_id)%alpha_rho /= dflt_real)) then
call s_mpi_abort('alpha_rho must not be altered for inactive '// &
'patch '//trim(iStr)//'. Exiting ...')
NEW
if (.not. f_all_default(patch_icpp(patch_id)%alpha_rho)) then
call s_mpi_abort('alpha_rho must not be altered for inactive '// &
'patch '//trim(iStr)//'. Exiting ...')
Type of change
[x] Something else
Scope
[x] This PR comprises a set of related changes with a common goal
How Has This Been Tested?
[x] All tests passed (including post-processing) on CPUs locally
[x] All tests passed (including post-processing) on GPUs on Delta
[x] Produced intended error messages with various incorrect case files.
Test Configuration:
What computers and compilers did you use to test this:
CPU: Ubuntu 22.04.4 LTS; GNU v11.4.0
GPU: Delta
Checklist
[x] I have added comments for the new code
[x] I added Doxygen docstrings to the new code
[x] I have made corresponding changes to the documentation (docs/) (not applicable)
[x] I have added regression tests to the test suite so that people can verify in the future that the feature is behaving as expected
[x] I have added example cases in examples/ that demonstrate my new feature performing as expected.
They run to completion and demonstrate "interesting physics" (not applicable)
[x] I ran ./mfc.sh format before committing my code
[x] New and existing tests pass locally with my changes, including with GPU capability enabled (both NVIDIA hardware with NVHPC compilers and AMD hardware with CRAY compilers) and disabled (not for AMD hardware)
[x] This PR does not introduce any repeated code (it follows the DRY principle)
[x] I cannot think of a way to condense this code and reduce any introduced additional line count
If your code changes any code source files (anything in src/simulation)
To make sure the code is performing as expected on GPU devices, I have:
[x] Checked that the code compiles using NVHPC compilers
[ ] Checked that the code compiles using CRAY compilers
[x] Ran the code on either V100, A100, or H100 GPUs and ensured the new feature performed as expected (the GPU results match the CPU results)
[ ] Ran the code on MI200+ GPUs and ensure the new features performed as expected (the GPU results match the CPU results)
[x] Enclosed the new feature via nvtx ranges so that they can be identified in profiles (not applicable)
[x] Ran a Nsight Systems profile using ./mfc.sh run XXXX --gpu -t simulation --nsys, and have attached the output file (.nsys-rep) and plain text results to this PR
nsys.txt
[ ] Ran an Omniperf profile using ./mfc.sh run XXXX --gpu -t simulation --omniperf, and have attached the output file and plain text results to this PR.
[x] Ran my code using various numbers of different GPUs (1, 2, and 8, for example) in parallel and made sure that the results scale similarly to what happens if you run without the new code/feature
Description
This PR closes #484 and partially resolves #482.
Issue #484 : As I was improving the checker logic and make the error messages more meaningful, I realized there were many confusing and overlapping conditions, as well as code duplications both within files and across the m_checker.f files in pre_process, simulation, and post_process. This is an attempt to clean everything up by refactoring the code:
So after extracting common checks by the three stages, simplifying logic, and refactoring:
and that's with adding many more error messages.
Issue #482 : This would be a simple search and replace as the functions to compare floating point numbers are already in place. However, 107 changes across 17 files are required, so I think it's cleaner to do this in a separate PR.
Examples of changes
1. Make code more concise
OLD
NEW
2. Reduce redundant logic: the (n == 0 .and. p /= 0) case is caught elsewhere
OLD
NEW
3. Abstraction using fpp
Note:
OLD
NEW
4. More robust comparison with dflt_real that works with arrays
OLD
NEW
Type of change
Scope
How Has This Been Tested?
Test Configuration:
Checklist
docs/
) (not applicable)examples/
that demonstrate my new feature performing as expected. They run to completion and demonstrate "interesting physics" (not applicable)./mfc.sh format
before committing my codeIf your code changes any code source files (anything in
src/simulation
)To make sure the code is performing as expected on GPU devices, I have:
nvtx
ranges so that they can be identified in profiles (not applicable)./mfc.sh run XXXX --gpu -t simulation --nsys
, and have attached the output file (.nsys-rep
) and plain text results to this PR nsys.txt./mfc.sh run XXXX --gpu -t simulation --omniperf
, and have attached the output file and plain text results to this PR.