Open niermann999 opened 4 months ago
What is the gcc & cuda version?
gcc/13.2 cuda/12.4
Curious. With GCC 11.4 + CUDA 12.4 it does work happily on my laptop. :thinking: Will try with GCC 13 in a little bit...
Never mind. Once I actually do the build in debug mode, I do get the same. With both GCC 11.4 and 13.1.
What I see is:
[ RUN ] algebra_plugins/test_cuda_basics/cuda_eigen_eigen<float>.transform3
CUDA Exception: Warp Illegal Instruction
The exception was triggered at PC 0x0 (Transform.h:1405)
Thread 1 "algebra_test_ei" received signal CUDA_EXCEPTION_4, Warp Illegal Instruction.
[Switching focus to CUDA kernel 0, grid 15, block (0,0,0), thread (128,0,0), device 0, sm 0, warp 6, lane 0]
0x0000000000000010 in Eigen::internal::check_static_allocation_size<double, 9> ()
at /home/krasznaa/ATLAS/projects/algebra/algebra-plugins/out/build/default-x86-64/_deps/eigen3-src/Eigen/src/Geometry/Transform.h:1405
1405 static EIGEN_DEVICE_FUNC EIGEN_STRONG_INLINE ResultType run(const TransformType& T, const MatrixType& other)
(cuda-gdb) bt
#0 0x0000000000000010 in Eigen::internal::check_static_allocation_size<double, 9> ()
at /home/krasznaa/ATLAS/projects/algebra/algebra-plugins/out/build/default-x86-64/_deps/eigen3-src/Eigen/src/Geometry/Transform.h:1405
#1 0x00007fffa7a1c950 in Eigen::Transform<float, 3, 2, 0>::Transform<Eigen::CwiseNullaryOp<Eigen::internal::scalar_identity_op<float>, Eigen::Matrix<float, 4, 4, 0, 4, 4> > > (this=0x7fffe3fff850, other=...)
at /home/krasznaa/ATLAS/projects/algebra/algebra-plugins/out/build/default-x86-64/_deps/eigen3-src/Eigen/src/Geometry/Transform.h:292
#2 0x00007fffa7a1bbb0 in Eigen::Transform<float, 3, 2, 0>::Identity ()
at /home/krasznaa/ATLAS/projects/algebra/algebra-plugins/out/build/default-x86-64/_deps/eigen3-src/Eigen/src/Geometry/Transform.h:535
#3 0x00007fffa7a2ca90 in algebra::eigen::math::transform3<float, algebra::eigen::matrix::actor<float> >::transform3 (
this=0x7fffe3fff850, t=..., x=..., y=..., z=..., get_inverse=true)
at /home/krasznaa/ATLAS/projects/algebra/algebra-plugins/math/eigen/include/algebra/math/impl/eigen_transform3.hpp:80
#4 0x00007fffa7a2bab0 in algebra::eigen::math::transform3<float, algebra::eigen::matrix::actor<float> >::transform3 (
this=0x7fffe3fff740, t=..., z=..., x=..., get_inverse=255)
at /home/krasznaa/ATLAS/projects/algebra/algebra-plugins/math/eigen/include/algebra/math/impl/eigen_transform3.hpp:118
#5 0x00007fffa77d6760 in test_device_basics<test_types<float, algebra::eigen::array<float, 2>, algebra::eigen::array<float, 3>, algebra::eigen::array<float, 2>, algebra::eigen::array<float, 3>, algebra::eigen::math::transform3<float, algebra::eigen::matrix::actor<float> >, int, algebra::eigen::matrix_type, algebra::eigen::matrix::actor<float> > >::transform3_ops (
this=0x7fffe3fffd40, t1=0x7fff00000000, t2=0x7fffe3fffad0, t3=0x7fffe3fffae8,
a=0x7fffa77d6760 <test_device_basics<test_types<float, algebra::eigen::array<float, 2>, algebra::eigen::array<float, 3>, algebra::eigen::array<float, 2>, algebra::eigen::array<float, 3>, algebra::eigen::math::transform3<float, algebra::eigen::matrix::actor<float> >, int, algebra::eigen::matrix_type, algebra::eigen::matrix::actor<float> > >::transform3_ops(algebra::eigen::array<float, 3>, algebra::eigen::array<float, 3>, algebra::eigen::array<float, 3>, algebra::eigen::array<float, 3>, algebra::eigen::array<float, 3>) const+1632>, b=0x7fffe3fffb18)
at /home/krasznaa/ATLAS/projects/algebra/algebra-plugins/tests/common/test_device_basics.hpp:207
#6 0x00007fffa77d5530 in transform3_ops_functor<test_types<float, algebra::eigen::array<float, 2>, algebra::eigen::array<float, 3>, algebra::eigen::array<float, 2>, algebra::eigen::array<float, 3>, algebra::eigen::math::transform3<float, algebra::eigen::matrix::actor<float> >, int, algebra::eigen::matrix_type, algebra::eigen::matrix::actor<float> > >::operator() (
this=0x7fffe3fffa40, i=140735743645696, t1=..., t2=..., t3=..., a=..., b=..., output=...)
at /home/krasznaa/ATLAS/projects/algebra/algebra-plugins/tests/accelerator/common/test_basics_functors.hpp:129
#7 0x00007fffa77d3070 in (anonymous namespace)::cudaTestKernel<transform3_ops_functor<test_types<float, algebra::eigen::array<float, 2>, algebra::eigen::array<float, 3>, algebra::eigen::array<float, 2>, algebra::eigen::array<float, 3>, algebra::eigen::math::transform3<float, algebra::eigen::matrix::actor<float> >, int, algebra::eigen::matrix_type, algebra::eigen::matrix::actor<float> > >, vecmem::data::vector_view<algebra::eigen::array<float, 3> >, vecmem::data::vector_view<algebra::eigen::array<float, 3> >, vecmem::data::vector_view<algebra::eigen::array<float, 3> >, vecmem::data::vector_view<algebra::eigen::array<float, 3> >, vecmem::data::vector_view<algebra::eigen::array<float, 3> >, vecmem::data::vector_view<float> ><<<(20,1,1),(256,1,1)>>> (
arraySizes=5000, args=..., args=..., args=..., args=..., args=..., args=...)
at /home/krasznaa/ATLAS/projects/algebra/algebra-plugins/tests/accelerator/cuda/common/execute_cuda_test.cuh:28
(cuda-gdb)
In case somebody manages to debug it before me. :wink:
The fact that the PC is 0x0 is rather worrying. :sweat_smile:
As the backtrace says, the crash is triggered by this line:
At which point it's hard to argue that this wouldn't be coming from some internal Eigen issue. :thinking: Having quickly looked at the code, I don't really understand what the issue is, or why the final call itself would cause an error.
Unfortunately I won't be able to debug this any further at the moment. Somebody could look into using a newer or different version of Eigen and see what happens with that. Otherwise, maybe we just don't use Eigen on GPUs in Debug mode for now... :thinking:
Sus.
Error message:
The Release build is fine