mt5555 closed this issue 9 years ago.
Formatting is all messed up. It is better to do an "svn diff" in the HOMME repository.
Tracked the problem down to the omp parallel do loop over q at line 807; commenting out this line restores conservation.
It is not caused by the limiter_option==8 code, though. Non-conservation remains even when that code is commented out.
To clarify issue #11: conservation is lost when COLUMN_OMP is turned on. The mass numbers look like this:

less out_ne8_tests_jobid.edique02 | grep Q1
Q1,Q diss, dQ^2/dt:  0.41161666941530E+02 kg/m^2  -0.2368476E-16  -0.1315974E-04
Q1,Q diss, dQ^2/dt:  0.41184360483096E+02 kg/m^2   0.1891128E-04   0.1266887E-04
Q1,Q diss, dQ^2/dt:  0.52292195250432E+02 kg/m^2   0.2403763E-04  -0.5090022E-06
Q1,Q diss, dQ^2/dt:  0.66429692231790E+02 kg/m^2   0.3057929E-04   0.1945205E-04
When the parallel do over q at line 807 is commented out, the output looks like this:

Q1,Q diss, dQ^2/dt:  0.41161666941530E+02 kg/m^2  -0.2368476E-16  -0.1315974E-04
Q1,Q diss, dQ^2/dt:  0.41161666941530E+02 kg/m^2  -0.5921189E-17  -0.1222287E-04
Q1,Q diss, dQ^2/dt:  0.41161666941529E+02 kg/m^2   0.0000000E+00  -0.1323531E-04
Q1,Q diss, dQ^2/dt:  0.41161666941528E+02 kg/m^2   0.0000000E+00  -0.1472920E-05
I have traced the problem to line 817:

if ( rhs_viss /= 0 ) Qtens(:,:,k) = Qtens(:,:,k) + Qtens_biharmonic(:,:,k,q,ie)

If this line is excluded from the omp parallel do, the code works fine. I do not yet know why. I suspected memory corruption due to stack size, but setting OMP_STACKSIZE=1G did not help.
Any ideas?
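For reference, a minimal self-contained sketch of the loop structure being described (hypothetical array sizes and dummy data, not the actual HOMME code at lines 807/817):

program qtens_loop_sketch
  implicit none
  integer, parameter :: np = 4, nlev = 8, qsize = 3
  real(8) :: Qtens(np,np,nlev)
  real(8) :: Qtens_biharmonic(np,np,nlev,qsize)
  integer :: rhs_viss, q, k

  ! dummy data standing in for the real fields
  Qtens_biharmonic = 1.0d0
  rhs_viss = 3

  ! threaded loop over tracers; Qtens is a per-thread temporary
!$omp parallel do private(q,k,Qtens)
  do q = 1, qsize
     do k = 1, nlev
        Qtens(:,:,k) = real(q, 8)
        ! the statement reported at line 817
        if ( rhs_viss /= 0 ) Qtens(:,:,k) = Qtens(:,:,k) + Qtens_biharmonic(:,:,k,q)
     end do
  end do
!$omp end parallel do

  print *, 'sketch finished'
end program qtens_loop_sketch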
So when you say excluded from the omp parallel do, that means the line is still being executed, but outside the threaded region. Hence we know that Qtens_biharmonic() is actually correct and there are no threading errors related to its calculation.
Another (ridiculous, I know) suggestion: is OpenMP case sensitive? I.e., 'qtens' is thread private, but the Fortran is using Qtens.
What about giving Qtens a "q" dimension, so that it does not have to be made omp private? I guess if this works, it seems like it's just a workaround for an OpenMP bug.
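A minimal sketch of that idea (hypothetical sizes, not the HOMME arrays): the temporary gets a q dimension so each iteration of the threaded loop writes its own slice, and the array can stay shared instead of appearing in the omp private clause.

program qtens_qdim_sketch
  implicit none
  integer, parameter :: np = 4, nlev = 8, qsize = 3
  real(8) :: Qtens(np,np,nlev,qsize)              ! shared, indexed by q
  real(8) :: Qtens_biharmonic(np,np,nlev,qsize)
  integer :: q, k

  Qtens = 0.0d0
  Qtens_biharmonic = 1.0d0

  ! only the loop indices need to be private now
!$omp parallel do private(q,k)
  do q = 1, qsize
     do k = 1, nlev
        Qtens(:,:,k,q) = Qtens(:,:,k,q) + Qtens_biharmonic(:,:,k,q)
     end do
  end do
!$omp end parallel do

  print *, 'sum(Qtens) =', sum(Qtens)
end program qtens_qdim_sketch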
For "fun" you might replace the array syntax notation with explicit do-loops (with "omp private" loop indices), and see if this then works in the openmp loop. This could just be a compiler bug also - not unknown. Can't see why any temporaries would be created for this expression, but I do not know what goes on under the covers.
Pat
On 7/23/15 7:02 PM, David M. Hall wrote:
> have traced the problem to line 817:
> if ( rhs_viss /= 0 ) Qtens(:,:,k) = Qtens(:,:,k) + Qtens_biharmonic(:,:,k,q,ie)
> if this line is excluded from the omp parallel do, the code works fine. do not yet know why.
> suspected memory corruption due to stack size? but setting OMP_STACKSIZE=1G did not help.
> Any ideas?
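A minimal sketch of the explicit-loop rewrite Pat suggests above (hypothetical sizes and dummy data, not the HOMME code): the array-syntax statement becomes nested do loops whose indices are listed in the private clause, so no compiler-generated array temporaries are involved.

program explicit_loop_sketch
  implicit none
  integer, parameter :: np = 4, nlev = 8, qsize = 3
  real(8) :: Qtens(np,np,nlev), Qtens_biharmonic(np,np,nlev,qsize)
  integer :: rhs_viss, q, k, i, j

  Qtens_biharmonic = 1.0d0
  rhs_viss = 3

!$omp parallel do private(q,k,i,j,Qtens)
  do q = 1, qsize
     do k = 1, nlev
        Qtens(:,:,k) = real(q, 8)
        if ( rhs_viss /= 0 ) then
           ! explicit loops replacing Qtens(:,:,k) = Qtens(:,:,k) + Qtens_biharmonic(:,:,k,q,ie)
           do j = 1, np
              do i = 1, np
                 Qtens(i,j,k) = Qtens(i,j,k) + Qtens_biharmonic(i,j,k,q)
              end do
           end do
        end if
     end do
  end do
!$omp end parallel do

  print *, 'sketch finished'
end program explicit_loop_sketch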
Traced the bug to lines 817 and 673. Removing the static initialization in "integer :: rhs_viss = 0" fixes the problem. Seems to be a compiler bug? OpenMP and the implied "save" keyword are not playing well together?
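For reference, a minimal sketch of the Fortran behaviour being described (hypothetical routine names, not the HOMME code): an initializer on a local declaration gives the variable an implied SAVE attribute, so there is a single static copy that is initialized only once, retained across calls, and shared by default inside an OpenMP parallel region, rather than a fresh value of 0 on every call.

program implied_save_sketch
  implicit none
  integer :: i
  do i = 1, 3
     call with_initializer()
     call without_initializer()
  end do
contains

  subroutine with_initializer()
    ! initializer implies SAVE: one static copy, initialized only once
    integer :: n = 0
    n = n + 1
    print *, 'with initializer:    n =', n   ! prints 1, 2, 3 across the calls
  end subroutine with_initializer

  subroutine without_initializer()
    ! ordinary local variable, reset explicitly on every call
    integer :: n
    n = 0
    n = n + 1
    print *, 'without initializer: n =', n   ! prints 1 on every call
  end subroutine without_initializer

end program implied_save_sketch

Presumably, with the initializer removed and rhs_viss assigned explicitly at run time, the variable behaves like an ordinary local again and can safely be used inside the threaded loop.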
This recent patch from John Dennis in HOMME (see below) might be related to this bug.
dennis: Fixed issues with COLUMN_OPENMP in the Eulerian Advection code.
Modified: trunk/src/share/prim_advection_mod.F90
--- trunk/src/share/prim_advection_mod.F90    2015-07-10 03:38:41 UTC (rev 4738)
+++ trunk/src/share/prim_advection_mod.F90    2015-07-16 15:27:16 UTC (rev 4739)
@@ -1,7 +1,7 @@
 #ifdef HAVE_CONFIG_H
 #include "config.h"
 #endif
-#define NEWEULER_B4B 1
+#undef NEWEULER_B4B
 #define OVERLAP 1
@@ -2355,11 +2355,14 @@
 do ie = nets , nete
 ! add hyperviscosity to RHS. apply to Q at timelevel n0, Qdp(n0)/dp
 #if (defined COLUMN_OPENMP)
-!$omp parallel do private(k, q)
+!$omp parallel do private(k)
 #endif
+#if (defined COLUMN_OPENMP)
+!$omp parallel do private(q,k)
+#endif
 do q = 1 , qsize
 do k=1,nlev
 Qtens_biharmonic(:,:,k,q,ie) = elem(ie)%state%Qdp(:,:,k,q,n0_qdp)/dp(:,:,k)
@@ -2390,10 +2393,10 @@
 ! nu_p>0): qtens_biharmonc *= elem()%psdiss_ave (for consistency, if nu_p=nu_q)
 if ( nu_p > 0 ) then
 do ie = nets , nete
+#ifdef NEWEULER_B4B
 #if (defined COLUMN_OPENMP)
 #endif
-#ifdef NEWEULER_B4B
 do k = 1 , nlev
 dpdiss(:,:) = elem(ie)%derived%dpdiss_ave(:,:,k)
 do q = 1 , qsize
@@ -2402,9 +2405,15 @@
 enddo
 enddo
 #else
+#if (defined COLUMN_OPENMP)
 #if (defined COLUMN_OPENMP)
 #endif
 do q = 1 , qsize
 do k = 1 , nlev
 ! Loop inversion (AAM)
@@ -2447,7 +2456,7 @@
 do ie = nets , nete
 #if (defined COLUMN_OPENMP)
 #endif
 do q = 1 , qsize
 do k = 1 , nlev
 ! Loop inversion (AAM)
@@ -2476,7 +2485,7 @@
 ! Compute velocity used to advance Qdp
 #if (defined COLUMN_OPENMP)
-!$omp parallel do private(k)
 #endif
 do k = 1 , nlev
 ! Loop index added (AAM)
 ! derived variable divdp_proj() (DSS'd version of divdp) will only be correct on 2nd and 3rd stage
@@ -2487,6 +2496,9 @@
 enddo
 if ( limiter_option == 8) then
 ! Note that the term dpdissk is independent of Q
+#if (defined COLUMN_OPENMP)
 #if (defined COLUMN_OPENMP)
-!$omp parallel do private(q,k,gradQ,dp_star,qtens,dpdiss)
 #endif
 do q = 1 , qsize
 do k = 1 , nlev
 ! dp_star used as temporary instead of divdp (AAM)
@@ -2561,9 +2573,9 @@
 call edgeVpack(edgeAdvp1 , elem(ie)%state%Qdp(:,:,:,q,np1_qdp) , nlev , kptr , ie )
 enddo
 ! also DSS extra field
-#if (defined COLUMN_OPENMP)
-!$omp parallel do private(k)
-#endif
+!JMD#if (defined COLUMN_OPENMP)
+!JMD !$omp parallel do private(k)
+!JMD#endif
 do k = 1 , nlev
 DSSvar(:,:,k) = elem(ie)%spheremp(:,:) * DSSvar(:,:,k)
 enddo
@@ -2578,12 +2590,15 @@
 if ( DSSopt == DSSdiv_vdp_ave ) DSSvar => elem(ie)%derived%divdp_proj(:,:,:)
 call edgeVunpack( edgeAdvp1 , DSSvar(:,:,1:nlev) , nlev , qsize*nlev , ie )
+#if (defined COLUMN_OPENMP)
 #if (defined COLUMN_OPENMP)
 #endif
 do q = 1 , qsize
 kptr = nlev*(q-1)