trilinos / Trilinos

Primary repository for the Trilinos Project
https://trilinos.org/
Other
1.19k stars 565 forks source link

Problems with autotester SHA1s passed into get-changed-trilinos-packages.sh in Trilinos auto-tester #3265

Closed bartlettroscoe closed 6 years ago

bartlettroscoe commented 6 years ago

@trilinos/framework, @william76

Next Action Status

PR #3258 merged on 8/14/2018 which should fix this and this seems to be fixed as noted in closed #3133.

Description

There seems to be a problem with the Trilinos autotester in passing in the right range of commits to the get-changed-trilinos-packages.sh script (see #3133 and #3218). The evidence for this is the PR testing iteration for PR #3260 which only changes files under the cmake/std/atdm/ directory yet it triggered the enable of several Trilinos packages as shown in this PR iteration this morning which shows:

loading initial cache file /scratch/trilinos/workspace/trilinos-folder/Trilinos_pullrequest_gcc_4.8.4/packageEnables.cmake
-- Setting Trilinos_ENABLE_ALL_PACKAGES = ON
-- Setting Trilinos_ENABLE_Belos = ON
-- Setting Trilinos_ENABLE_Ifpack2 = ON
-- Setting Trilinos_ENABLE_Piro = ON
-- Setting Trilinos_ENABLE_ROL = ON
-- Setting Trilinos_ENABLE_TpetraCore = ON

My guess is that the Python script is passing in the git SHA1 for the tip of the 'develop' branch instead of the base commit that the topic branch is created off of.

bartlettroscoe commented 6 years ago

@trilinos/framework,

Looking at:

showing the command:

changed_packages_app=Trilinos/commonTools/framework/get-changed-trilinos-packages.sh
${changed_packages_app} ${TRILINOS_SOURCE_SHA} ${TRILINOS_TARGET_SHA} packageEnables.cmake

is ${TRILINOS_SOURCE_SHA} the tip of the 'develop' branch? If it is, this is going to give you the wrong set of difference files.

Looking at the SHA1s for this PR iteration of #3260 reported here, they are:

If you run the script get-changed-trilinos-packages.sh using these SHA1s you get:

$ ./Trilinos/commonTools/framework/get-changed-trilinos-packages.sh \
  45adc90 c96c475 enablePackages.cmake \
  &> get-changed-trilinos-packages.out

$ cmake -P enablePackages.cmake
-- Setting Trilinos_ENABLE_ALL_PACKAGES = ON
-- Setting Trilinos_ENABLE_Belos = ON
-- Setting Trilinos_ENABLE_Ifpack2 = ON
-- Setting Trilinos_ENABLE_Piro = ON
-- Setting Trilinos_ENABLE_ROL = ON
-- Setting Trilinos_ENABLE_TpetraCore = ON

That is not good. So the SHA1 TRILINOS_SOURCE_SHA = c96c475 is the tip of the branch for my PR #3260 where the top of the git graph is:

Graph for TRILINOS_SOURCE_SHA c96c475

* c96c475 "Change ctest drivers from cuda-xxx to cuda-9.2-xxx (TRIL-215)" <rabartl@sandia.gov> [Wed Aug 8 15:11:57 2018 -0600] (15 hours ago)
* 5ecd0f5 "Adjust for all-at-once build being the default (TRIL-198)" <rabartl@sandia.gov> [Wed Aug 8 13:50:12 2018 -0600] (17 hours ago)
* a3363fa "CUDA 9.2 build on white/ride and remove GPU KOKKOS_ARCH for GNU build (TRIL-215)" <rabartl@sandia.gov> [Wed Aug 8 08:35:49 2018 -0600] (20 hours ago)
* e16895a "Disable test Piro_MatrixFreeDecorator_UnitTests_MPI_4 in Intel builds (#2474) (#3252)" <rabartl@sandia.gov> [Tue Aug 7 22:17:52 2018 -0400] (34 hours ago)
...

So what is the other SHA1 TRILINOS_TARGET_SHA = 45adc90?

Graph for TRILINOS_TARGET_SHA = 45adc90:

*   45adc90 "Merge pull request #3249 from trilinos/Fix-3235" <mhoemmen@users.noreply.github.com> [Wed Aug 8 12:34:59 2018 -0600] (18 hours ago)
|\  
| * fb8dc4a "Tpetra,Belos: Fix #3235, & add unit test" <mhoemme@sandia.gov> [Tue Aug 7 12:54:26 2018 -0600] (2 days ago)
* | f93a0cb "Consolidate disables for all builds on 'mutrino' (#3183) (#3251)" <rabartl@sandia.gov> [Wed Aug 8 13:56:51 2018 -0400] (19 hours ago)
* |   74d195d "Merge pull request #3243 from jmgate/ifpack2-remove-unused-parameter-warnings" <mhoemmen@users.noreply.github.com> [Wed Aug 8 09:40:37 2018 -0600] (21 hours ago)
|\ \  
| * | b63d73b "Ifpack2:  Remove Unused Parameter Warnings" <jmgate@sandia.gov> [Mon Aug 6 13:32:31 2018 -0600] (2 days ago)
| |/  
* | f9de01a "Piro: use template keyword when calling memeber function (fix issue #3157)" <mperego@proxima.ca.sandia.gov> [Tue Aug 7 09:33:20 2018 -0700] (22 hours ago)
* | 09e26b1 "ROL: fix typo in ThyraProductME_Objective constructor, '==' --> '='." <mperego@proxima.ca.sandia.gov> [Tue Aug 7 09:31:50 2018 -0700] (22 hours ago)
* | e16895a "Disable test Piro_MatrixFreeDecorator_UnitTests_MPI_4 in Intel builds (#2474) (#3252)" <rabartl@sandia.gov> [Tue Aug 7 22:17:52 2018 -0400] (34 hours ago)
...

So here we see the problem. These two graphs are divergent. TRILINOS_TARGET_SHA= 45adc90 must be the tip of the 'develop' branch that will be merged intoTRILINOS_SOURCE_SHA` = c96c475 in order to do testing. These two branches share the common commit e16895a:

e16895a "Disable test Piro_MatrixFreeDecorator_UnitTests_MPI_4 in Intel builds (#2474) (#3252)" <rabartl@sandia.gov> [Tue Aug 7 22:17:52 2018 -0400] (34 hours ago)

The way to fix this is to find the commit on the topic branch that is already on 'develop' The way you do that is with:

TRILINOS_MERGE_BASE_SHA1=`git merge-base ${TRILINOS_SOURCE_SHA} ${TRILINOS_TARGET_SHA}`

For this case, this returns:

$ git merge-base 45adc90 c96c475
e16895a3f7a7d7213e47a1fc6ce7d5dd71043b25

That is the correct SHA1. Using that SHA1, we get the right set of file changes:

$ ./Trilinos/commonTools/framework/get-changed-trilinos-packages.sh \
  e16895 c96c475 enablePackages.cmake \
  &> get-changed-trilinos-packages.out

$ cat changed-files.txt 
cmake/ctest/drivers/atdm/ride/drivers/Trilinos-atdm-white-ride-cuda-9.2-debug-panzer.sh
cmake/ctest/drivers/atdm/ride/drivers/Trilinos-atdm-white-ride-cuda-9.2-debug-pt.sh
cmake/ctest/drivers/atdm/ride/drivers/Trilinos-atdm-white-ride-cuda-9.2-debug.sh
cmake/ctest/drivers/atdm/ride/drivers/Trilinos-atdm-white-ride-cuda-9.2-opt-Pascal60.sh
cmake/ctest/drivers/atdm/ride/drivers/Trilinos-atdm-white-ride-cuda-9.2-opt-panzer-Pascal60.sh
cmake/ctest/drivers/atdm/ride/drivers/Trilinos-atdm-white-ride-cuda-9.2-opt-panzer.sh
cmake/ctest/drivers/atdm/ride/drivers/Trilinos-atdm-white-ride-cuda-9.2-opt.sh
cmake/ctest/drivers/atdm/ride/drivers/Trilinos-atdm-white-ride-cuda-debug-all-at-once.sh
cmake/ctest/drivers/atdm/ride/drivers/Trilinos-atdm-white-ride-cuda-debug-panzer.sh
cmake/ctest/drivers/atdm/ride/drivers/Trilinos-atdm-white-ride-cuda-debug-pt-all-at-once.sh
cmake/ctest/drivers/atdm/ride/drivers/Trilinos-atdm-white-ride-cuda-debug.sh
cmake/ctest/drivers/atdm/ride/drivers/Trilinos-atdm-white-ride-cuda-opt-Pascal60.sh
cmake/ctest/drivers/atdm/ride/drivers/Trilinos-atdm-white-ride-cuda-opt-panzer-Pascal60.sh
cmake/ctest/drivers/atdm/ride/drivers/Trilinos-atdm-white-ride-cuda-opt-panzer.sh
cmake/ctest/drivers/atdm/ride/drivers/Trilinos-atdm-white-ride-cuda-opt.sh
cmake/std/atdm/ride/all_supported_builds.sh
cmake/std/atdm/ride/environment.sh
cmake/std/atdm/ride/tweaks/CUDA-9.2-DEBUG-CUDA-POWER8-KEPLER37.cmake
cmake/std/atdm/ride/tweaks/CUDA-9.2-RELEASE-CUDA-POWER8-KEPLER37.cmake
cmake/std/atdm/ride/tweaks/CUDA-DEBUG-CUDA-POWER8-KEPLER37.cmake
cmake/std/atdm/ride/tweaks/CUDA-RELEASE-CUDA-POWER8-KEPLER37.cmake
cmake/std/atdm/ride/tweaks/GNU-DEBUG-OPENMP-POWER8-KEPLER37.cmake
cmake/std/atdm/ride/tweaks/GNU-DEBUG-OPENMP-POWER8.cmake
cmake/std/atdm/ride/tweaks/GNU-RELEASE-OPENMP-POWER8-KEPLER37.cmake
cmake/std/atdm/ride/tweaks/GNU-RELEASE-OPENMP-POWER8.cmake
cmake/std/atdm/utils/set_build_options.sh

That is the correct set of changed files.

I will update the branch in PR #3258 to fix this.

bartlettroscoe commented 6 years ago

With PR #3258 merged (thank @jwillenbring), this issue should now be resolved. Now we just need to watch the next few PRs over the coming days and make sure that the right packages are being enabled and tested.

bartlettroscoe commented 6 years ago

All seems good as noted in closed #3133.

Closing as complete.