E3SM-Project / E3SM

Energy Exascale Earth System Model source code. NOTE: use "maint" branches for your work. Head of master is not validated.
https://docs.e3sm.org/E3SM

BUILD_THREADED can no longer disable threading post-CIME #590

Closed worleyph closed 8 years ago

worleyph commented 8 years ago

I am almost certain that the following is a change of behavior post-CIME. It is inconvenient at the very least.

A) 1) after

 ./create_newcase -case XXX -res ne30_m120 -compset A_B1850CN 

in env_build.xml the following is set (by the create_newcase script):

 <entry id="BUILD_THREADED"   value="FALSE"  />

2) next I modified env_mach_pes.xml to

 <entry id="NTHRDS_ATM"   value="2"  />

and

 <entry id="NTHRDS_LND"   value="1"  />
 <entry id="NTHRDS_ICE"   value="1"  />
 <entry id="NTHRDS_OCN"   value="1"  />
 <entry id="NTHRDS_CPL"   value="1"  />
 <entry id="NTHRDS_GLC"   value="1"  />
 <entry id="NTHRDS_ROF"   value="1"  />
 <entry id="NTHRDS_WAV"   value="1"  />

3) after

 ./cesm_setup

the BUILD_THREADED setting was changed to

 <entry id="BUILD_THREADED"   value="TRUE"  />

(by the cesm_setup script).

4) I manually edited env_build.xml to change this back to

 <entry id="BUILD_THREADED"   value="FALSE"  />

5) when compiling, '-mp' was still used when building all of the components (on Titan). On Mira/Cetus, '-qsmp=omp' was used. So, BUILD_THREADED being set to FALSE was ignored for the components with NTHRDS == 1. (It was also ignored for NTHRDS > 1, but that is the expected behavior.)

B) In contrast, if env_mach_pes.xml specifies all MPI, e.g.

 <entry id="NTHRDS_ATM"   value="1"  />

in the above example, and if env_build.xml is modified to

 <entry id="BUILD_THREADED"   value="TRUE"  />

then '-mp' or '-qsmp=omp' is added to the compile lines for all components.

Summary: BUILD_THREADED can no longer be used to build some components threaded and others not. It can be used to enable OpenMP (for all components) when env_mach_pes.xml would otherwise indicate an MPI-only build, but it cannot do the opposite, nor can it produce any mixed MPI-only/MPI+OpenMP build.
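
To make the difference concrete, here is a minimal Python sketch of the two decision rules as described in this report. It is not the actual CIME build logic. The function names `observed_openmp` and `expected_openmp` are hypothetical, and the rules are inferred only from the behavior described in cases A and B above.

```python
# Hypothetical model of the per-component OpenMP decision; not CIME code.

def observed_openmp(build_threaded, nthrds):
    """Behavior reported in this issue: one decision applied to all components.

    OpenMP is enabled everywhere if BUILD_THREADED is TRUE or if any
    component has NTHRDS > 1.
    """
    enable_all = build_threaded or any(n > 1 for n in nthrds.values())
    return {comp: enable_all for comp in nthrds}


def expected_openmp(build_threaded, nthrds):
    """Behavior the report asks for: a separate decision per component.

    A component is built with OpenMP if BUILD_THREADED is TRUE or if that
    component's own NTHRDS is greater than 1.
    """
    return {comp: build_threaded or n > 1 for comp, n in nthrds.items()}


# Thread counts from case A (env_mach_pes.xml after step 2).
case_a = {"ATM": 2, "LND": 1, "ICE": 1, "OCN": 1,
          "CPL": 1, "GLC": 1, "ROF": 1, "WAV": 1}

observed = observed_openmp(False, case_a)
expected = expected_openmp(False, case_a)
```

With BUILD_THREADED set to FALSE, `observed` marks every component as threaded, while `expected` marks only ATM. With BUILD_THREADED set to TRUE, both rules enable OpenMP for all components, which matches case B.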

amametjanov commented 8 years ago

Pat, please try out this patch: https://github.com/ACME-Climate/ACME/compare/azamat/scripts/build_threaded

worleyph commented 8 years ago

Didn't work, but looks like you are close. The problem is that -qsmp=omp is not used in the final link step for cesm.exe, so any components built with OpenMP enabled will have undefined references, e.g.

 ACME/components/cam/src/control/cam_history.F90:4433: undefined reference to `_xlsmpParSelf'
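
The constraint on the link step can be sketched in Python as follows. This is an illustration under stated assumptions, not the patch itself. The function name `link_flags` is hypothetical, and the only flag used is the `-qsmp=omp` option mentioned above for Mira/Cetus.

```python
# Hypothetical sketch: the final link needs the OpenMP flag whenever at
# least one component was compiled with it, even if most were not.

def link_flags(component_openmp, openmp_flag):
    """Return the extra flags for the final link of the executable.

    component_openmp maps a component name to True if that component was
    compiled with OpenMP enabled.
    """
    if any(component_openmp.values()):
        return [openmp_flag]
    return []


# Mixed build: only the atmosphere component is threaded.
mixed_build = {"ATM": True, "LND": False, "ICE": False, "OCN": False}
mixed_link = link_flags(mixed_build, "-qsmp=omp")

# MPI-only build: no component is threaded, so no flag is needed.
mpi_only_build = {comp: False for comp in mixed_build}
mpi_only_link = link_flags(mpi_only_build, "-qsmp=omp")
```

In the mixed case the flag is kept for the link, so the OpenMP runtime symbols referenced by the threaded component can be resolved.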

jayeshkrishna commented 8 years ago

@worleyph : If this error (undefined reference to SMP libs) is on Mira, it will be fixed by PR #611

worleyph commented 8 years ago

I think that this is a different issue. @amametjanov fixed the build logic to remove -qsmp=omp for components that do not need to be built with threading. Unfortunately, this initial implementation also removed -qsmp=omp from the final link step. So, it is not an "error" in that the build logic is doing what it is currently being asked to do. The logic just needs to be changed a little?

Or maybe not. Sorry - just looked at the PR. Perhaps this is exactly the same thing.

@amametjanov , I'll try again, but now include the mods in PR #611 .

worleyph commented 8 years ago

@amametjanov , experiments on Cori show that the threading build logic is working with your modifications. I haven't finished the same tests on Mira or Cetus (compilation is pretty slow at the moment), but I think that it is working there as well. I'd go ahead and submit a pull request. Thanks for doing this.

amametjanov commented 8 years ago

Great. Please let me know if there are any problems with threading.