Closed PhilMiller closed 6 months ago
After preprocessing the file to a self-contained test case with `-E`, I can reproduce the failure with just `icpc -xCORE-AVX2 -g -fPIE -std=c++14 -c normals.cpp`. When I drop the detailed arguments, the compiler succeeds, so that should ease bug isolation a bit.
Just `icpc -xCORE-AVX2 -c normals.cpp` fails, so there's something wrong in optimization or code generation.
`icpc -O0 -xCORE-AVX2 -c normals.cpp` succeeds.
The compiler succeeds with `-O1` and fails with `-O2`.
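The optimization-level checks above can be scripted so they are easy to repeat on another machine or compiler version. This is only a sketch: the function name `sweep_opt_levels` is hypothetical, and it assumes the compiler accepts `-c`, `-o`, and the `-O<n>` levels the same way `icpc` does in the commands above.

```shell
#!/bin/sh
# Sketch: report which optimization levels compile and which fail.
# Usage: sweep_opt_levels <compiler> <source-file>
# The function name and argument order are illustrative, not part of any tool.
sweep_opt_levels() {
    cxx="$1"
    src="$2"
    obj=$(mktemp) || return 2   # scratch object file, removed at the end
    for level in -O0 -O1 -O2; do
        # Only the exit status matters here; compiler output is discarded.
        if "$cxx" "$level" -xCORE-AVX2 -c "$src" -o "$obj" >/dev/null 2>&1; then
            printf '%s ok\n' "$level"
        else
            printf '%s FAIL\n' "$level"
        fi
    done
    rm -f "$obj"
}
```

For the case in this thread it would be invoked as `sweep_opt_levels icpc normals.cpp`.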
I'll see if I can reproduce this on a faster, less shared machine, so I can start running CReduce on it.
At the very end of the file, if I comment out the first instantiation as follows, compilation still fails:
//template class normals_test2d_UnitTest<ResidualType>; normals_test2d_UnitTest<ResidualType> instance_normals_ResidualType_test2d_UnitTest("ResidualType");
template class normals_test2d_UnitTest<JacobianType>; normals_test2d_UnitTest<JacobianType> instance_normals_JacobianType_test2d_UnitTest("JacobianType");
If I comment out the other instantiation instead, compilation succeeds.
There's at least some target specificity to the failure, maybe in vectorization, because if I run `-O2` without `-xCORE-AVX2` then it passes. I suspect it's not an issue in code generation, because failures take ~18 s, while successes take ~48 s.
I can remove the explicit template instantiation, and just leave the subsequent object definition (which instantiates the template implicitly), and it still fails:
normals_test2d_UnitTest<JacobianType> instance_normals_JacobianType_test2d_UnitTest("JacobianType");
Inlining the `JacobianType` typedef preserves the failure.
Further reduction reports to come.
There's some environment variable dependence, since just running the compiler directly reports syntax errors, while loading the module and then running it hits the internal error.
In response to queries from @rppawlo about memory usage, I ran under `/usr/bin/time`, and found a maximum resident set size of 1.2 GB, which should pose no problem on the system with 128 GB total. I also didn't see the virtual memory size blowing up in `top`.
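For anyone repeating the memory check, a small filter can pull the peak figure out of the report. This sketch assumes GNU time's verbose mode (`/usr/bin/time -v`), which prints a `Maximum resident set size (kbytes)` line on stderr; the helper name `max_rss_kb` is hypothetical.

```shell
#!/bin/sh
# Sketch: extract the peak resident set size (in kB) from the verbose
# report of GNU time, read on standard input.
# The helper name is illustrative; the label it matches is GNU time's.
max_rss_kb() {
    awk -F': ' '/Maximum resident set size/ { print $2 }'
}
```

A typical use would be `/usr/bin/time -v icpc -O2 -xCORE-AVX2 -c normals.cpp 2>&1 | max_rss_kb`, which merges the report from stderr into the pipe before filtering.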
I've cleared up the environment dependence by passing the relevant GCC compatibility flag, so that it doesn't pick up the ancient GCC 4.8.5 from the bare `PATH`.
/projects/global/toss3/compilers/intel/intel_2021/oneapi/compiler/2021.3.0/linux/bin/intel64/icpc -V -O2 -xCORE-AVX2 -gxx-name=/usr/tce/packages/gcc/gcc-6.1.0/bin/g++ -c cleaned.cpp
@trilinos/panzer
The full set of things I've changed to drive the build to completion on Intel 2021.3.0 can be found here: https://github.com/PhilMiller/Trilinos/tree/pm/intel19
The code as modified to work around the compiler failure in that one test passes the rest of the suite run by ctest
With CReduce, my test case is down to 'just' 2 MB, and still falling pretty rapidly. Hopefully, I'll have something reasonable by tomorrow or later in the week.
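For reference, C-Reduce is driven by an "interestingness test": a script that exits 0 only when the candidate file still shows the bug. Below is a minimal sketch of such a test, not the script actually used here. It assumes the failure can be recognized by the text `internal error` in the compiler output, and the function name `is_interesting` is hypothetical. Matching on the message rather than on the exit status matters, because a reduction that merely introduces a syntax error also makes the compiler fail.

```shell
#!/bin/sh
# Sketch of a C-Reduce interestingness test.
# Usage: is_interesting <source-file>
# Returns 0 only if the compiler output mentions an internal error.
# CXX defaults to icpc; any extra flags (for example a GCC compatibility
# flag) would need to be added to the command line below.
is_interesting() {
    src="$1"
    obj=$(mktemp) || return 2   # scratch object file
    # Capture stdout and stderr together; the exit status is ignored on purpose.
    output=$("${CXX:-icpc}" -O2 -xCORE-AVX2 -c "$src" -o "$obj" 2>&1)
    rm -f "$obj"
    printf '%s\n' "$output" | grep -q 'internal error'
}
```

Saved as `interesting.sh` with a final line such as `is_interesting cleaned.cpp`, it would be run as `creduce ./interesting.sh cleaned.cpp`.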
@PhilMiller - can you try using the `openmpi/4.0.5/intel-oneapi/2021.4.0` module on the Blake test bed to see if this is an issue in Intel 2021.4 OneAPI as well please (I think this will help with the bug report).
For posterity, the test case failed with the same error on 2021.4.0
I got the same error (internal error: 101003_1112). For me the Trilinos build failed when compiling kokkos/core/unit_test/serial/TestSerial_LocalDeepCopy.cpp. All the Intel compiler versions greater than 19.1 (tested up to v2022.0.2) produce the same error when compiling with an optimization flag greater than O2. The problem doesn't occur for Intel versions lower than or equal to 19.0.
It seems the error occurs with files having too many templates.
This issue has had no activity for 365 days and is marked for closure. It will be closed after an additional 30 days of inactivity.
If you would like to keep this issue open please add a comment and/or remove the `MARKED_FOR_CLOSURE` label.
If this issue should be kept open even with no activity beyond the time limits you can add the label `DO_NOT_AUTOCLOSE`.
If it is ok for this issue to be closed, feel free to go ahead and close it. Please do not add any comments or change any labels or otherwise touch this issue unless your intention is to reset the inactivity counter for an additional year.
Nope, Intel's classic compiler is still fragile around this.
This issue was closed due to inactivity for 395 days.
Bug Report
@trilinos/panzer @trilinos/framework
Description
I'm attempting to compile Trilinos as configured for EMPIRE using the Intel 2021.3.0 compilers. The build fails thusly:
Steps to Reproduce
--------------------------[ Script Execution Summary ]-------------------------
Ran the following on eclipse-login7:
build_trilinos.py \
    --compiler intel \
    --build-type debug \
    --lib-type shared \
    --src-dir /ascldap/users/pbmille/repos/ \
    --ref develop \
    --save-replay-file /projects/empire/users/pbmille/build/trilinos/.build_trilinos_replay_2022-01-17_19.53.01.603381 \
    --build-system ninja \
    --stage configure
Trilinos configured in:
/projects/empire/users/pbmille/build/trilinos/build/INTEL-18.0.2_OPENMPI-4.0.1-DEBUG-SERIAL-SHARED
with:
cmake \
    -D Trilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake,cmake/std/atdm/apps/empire/EMPIRETrilinosEnables.cmake \
    -D Panzer_ENABLE_EXAMPLES:BOOL=ON \
    -D Panzer_ENABLE_TESTS:BOOL=ON \
    -D TPL_ENABLE_gtest=OFF \
    -D CMAKE_INSTALL_PREFIX=/projects/empire/users/pbmille/build/trilinos/install/INTEL-18.0.2_OPENMPI-4.0.1-DEBUG-SERIAL-SHARED \
    -D PanzerAdaptersSTK_CurlLaplacianExample-ConvTest-Quad-Order-4_DISABLE:BOOL=ON \
    -D PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-3_DISABLE:BOOL=ON \
    -D TPL_ENABLE_CUSPARSE:BOOL=OFF \
    -G Ninja \
    /home/pbmille/repos/Trilinos
Configured Trilinos with the following commit:
| Repository | SHA1    |
|------------|---------|
| Trilinos   | f7fdd76 |
Load the environment used by this script with:
source /projects/empire/users/pbmille/build/trilinos/build/INTEL-18.0.2_OPENMPI-4.0.1-DEBUG-SERIAL-SHARED/load_matching_env.sh
Total time: 0h 2m 32.17s
For complete details, see:
/projects/empire/users/pbmille/build/trilinos/log/2022-01-17_19.53.01.603636_7f7os6zq/Build_Trilinos.html
------------------------[ End Script Execution Summary ]-----------------------