ROCm / rocPRIM

ROCm Parallel Primitives
https://rocm.docs.amd.com/projects/rocPRIM/
MIT License

Failing tests with libstdc++ assertions due to unsigned overflow #570

Closed AngryLoki closed 3 months ago

AngryLoki commented 5 months ago

Describe the bug

Hi, while testing with hardened libstdc++ (compiled with -D_GLIBCXX_ASSERTIONS), many rocPRIM tests failed with a traceback like:

libstdc++.so.6!std::__glibcxx_assert_fail(char const*, int, char const*, char const*) (Unknown Source:0)
std::uniform_int_distribution<unsigned int>::param_type::param_type() (\usr\lib\gcc\x86_64-pc-linux-gnu\13\include\g++-v13\bits\uniform_int_dist.h:108)
std::uniform_int_distribution<unsigned int>::uniform_int_distribution() (\usr\lib\gcc\x86_64-pc-linux-gnu\13\include\g++-v13\bits\uniform_int_dist.h:145)
test_utils::generate_random_data_n<__gnu_cxx::__normal_iterator<unsigned char*, std::vector<unsigned char, std::allocator<unsigned char> > >, int, int, std::linear_congruential_engine<unsigned long, 16807ul, 0ul, 2147483647ul> >() (\var\tmp\portage\sci-libs\rocPRIM-6.1.1\work\rocPRIM-rocm-6.1.1\test\rocprim\test_utils_data_generation.hpp:262)
test_utils::get_random_data<unsigned char, int, int>(seed_type seed_value) (\var\tmp\portage\sci-libs\rocPRIM-6.1.1\work\rocPRIM-rocm-6.1.1\test\rocprim\test_utils_data_generation.hpp:354)
RocprimBlockLoadStoreClassTestsFirstPart_LoadStoreClass_Test<class_params<unsigned char, (rocprim::block_load_method)2, (rocprim::block_store_method)2, 512u, 1u> >::TestBody(class RocprimBlockLoadStoreClassTestsFirstPart_LoadStoreClass_Test<class_params<unsigned char, (rocprim::block_load_method)2, (rocprim::block_store_method)2, 512u, 1u> > * this) (\var\tmp\portage\sci-libs\rocPRIM-6.1.1\work\rocPRIM-rocm-6.1.1\test\rocprim\test_block_load_store.hpp:53)

where the starting point is test_utils::get_random_data<Type>(size, -100, 100, seed_value); and the end point is the failing assertion __glibcxx_assert(_M_a <= _M_b), with _M_a = 4294967196 and _M_b = 100.

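I.e., the tests try to construct a uniform_int_distribution<unsigned int> with bounds -100 and 100. The implicit conversion of -100 to unsigned int wraps to 4294967196, which violates the distribution's precondition a <= b and therefore results in undefined behavior.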
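To make the arithmetic concrete, here is a minimal sketch of the conversion, assuming a 32-bit unsigned int. The helper names to_distribution_bound and bounds_are_valid are hypothetical and exist only for illustration; they are not part of rocPRIM or its test utilities.

```cpp
#include <cassert>

// Hypothetical helper (not part of rocPRIM): mimics what happens when an int
// bound is passed to a distribution whose result type is unsigned int.
inline unsigned int to_distribution_bound(int value)
{
    // Signed-to-unsigned conversion is well defined: it wraps modulo 2^N,
    // where N is the width of unsigned int (assumed to be 32 here).
    return static_cast<unsigned int>(value);
}

// Hypothetical helper: true when (a, b) satisfies the precondition a <= b
// that std::uniform_int_distribution places on its parameters.
inline bool bounds_are_valid(unsigned int a, unsigned int b)
{
    return a <= b;
}
```

With these, to_distribution_bound(-100) yields 4294967196 (2^32 - 100), and bounds_are_valid(to_distribution_bound(-100), to_distribution_bound(100)) is false, which is exactly the condition that __glibcxx_assert(_M_a <= _M_b) catches.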

Affected tests are:

Some of them use unsigned short, some unsigned long long, and so on. Replacing the negative lower bound with a non-negative one helps, but maybe you can provide a better solution?
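One possible direction, shown only as a sketch and not as the fix the maintainers adopted: clamp the requested bounds to the range of the element type before constructing the distribution, so that a negative lower bound becomes 0 for unsigned types. The names clamp_bound and random_integers are hypothetical and do not mirror the signatures in test_utils_data_generation.hpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <random>
#include <type_traits>
#include <vector>

// Hypothetical helper: clamp an integral bound to the representable range of T.
template <class T, class U>
T clamp_bound(U requested)
{
    static_assert(std::is_integral<T>::value && std::is_integral<U>::value,
                  "integral types only");
    if (requested < 0)
    {
        // Negative input: unsigned targets clamp to 0.
        if (std::is_unsigned<T>::value)
            return T(0);
        // Signed target: compare in long long and clamp to T's minimum.
        const long long lo = std::numeric_limits<T>::min();
        return static_cast<T>(std::max<long long>(requested, lo));
    }
    // Non-negative input: compare as unsigned long long and clamp to T's maximum.
    const unsigned long long r  = static_cast<unsigned long long>(requested);
    const unsigned long long hi = static_cast<unsigned long long>(std::numeric_limits<T>::max());
    return static_cast<T>(std::min(r, hi));
}

// Hypothetical generator: draws n values of type T from [min, max] after clamping.
template <class T, class U, class V>
std::vector<T> random_integers(std::size_t n, U min, V max, unsigned int seed)
{
    // uniform_int_distribution is not specified for character types, so draw
    // from a wide type with the same signedness as T and narrow afterwards.
    using dist_t = typename std::conditional<std::is_signed<T>::value,
                                             long long,
                                             unsigned long long>::type;
    const dist_t lo = clamp_bound<T>(min);
    const dist_t hi = clamp_bound<T>(max);
    assert(lo <= hi); // the precondition that the failing tests violated

    std::default_random_engine             engine(seed);
    std::uniform_int_distribution<dist_t>  dist(lo, hi);
    std::vector<T>                         out(n);
    std::generate(out.begin(), out.end(), [&] { return static_cast<T>(dist(engine)); });
    return out;
}
```

For example, random_integers<unsigned char>(1000, -100, 100, 42u) would draw from [0, 100] instead of tripping the assertion. Note that clamping changes the distribution of the generated data for unsigned types, so tests that rely on a particular value range would need to be checked.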

Environment

Naraenda commented 5 months ago

We're tracking this internally. I'll try to get this in for 6.3, but it might slip to a patch release. The same goes for https://github.com/ROCm/rocThrust/issues/420.