Closed (njzjz closed this issue 3 years ago)
Summary
When compiling v2.0.0.b0 with cudatoolkit <=10.0, several cub-related warnings appear. I am not sure whether it is safe to ignore them.
DeePMD-kit version, installation method, input files, commands run, error log, etc.
v2.0.0.b0, cudatoolkit 9.2/10.0
The warning doesn't appear in cudatoolkit 10.1 builds.
Steps to Reproduce
pip install -v .
Further Information, Files, and Links
See https://dev.azure.com/deepmd-kit-recipes/dfe7535b-bc98-4288-9a49-be8914966e7a/_apis/build/builds/410/logs/112
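A log like the one linked above is long and repeats the same warning once per target architecture, so it can help to summarize it before deciding whether the warning matters. The following is a minimal sketch, assuming the log has been saved as plain text; `summarize_cub_warnings` is a hypothetical helper written for this issue, not part of DeePMD-kit or cub, and the sample string is a shortened stand-in for the real log.

```python
import re

# Warning text exactly as nvcc prints it in the log for this issue.
WARNING_TEXT = "warning: dynamic initialization in unreachable code"

# Template arguments in the instantiation trace include e.g. "PTX_ARCH=750".
PTX_ARCH_RE = re.compile(r"PTX_ARCH=(\d+)")


def summarize_cub_warnings(log_text):
    """Return (number of warnings, sorted distinct PTX_ARCH values) in log_text."""
    warning_count = log_text.count(WARNING_TEXT)
    archs = sorted({int(value) for value in PTX_ARCH_RE.findall(log_text)})
    return warning_count, archs


# Shortened stand-in for the real build log (hypothetical sample).
sample = (
    "radix_rank_sort_operations.cuh(102): " + WARNING_TEXT + "\n"
    "[with BLOCK_DIM_X=128, BLOCK_DIM_Z=1, PTX_ARCH=750]\n"
    "radix_rank_sort_operations.cuh(102): " + WARNING_TEXT + "\n"
    "[with BLOCK_DIM_X=128, BLOCK_DIM_Z=1, PTX_ARCH=300]\n"
)

count, archs = summarize_cub_warnings(sample)
print(count, archs)  # 2 [300, 750]
```

Run against the full log, this would show whether the warning is emitted once per architecture in the build's target list or only for some of them.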
2021-05-20T02:07:03.7111638Z [ 7%] Building NVCC (Device) object lib/src/cuda/CMakeFiles/deepmd_op_cuda.dir/deepmd_op_cuda_generated_prod_env_mat.cu.o
2021-05-20T02:07:06.2941810Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/../block/radix_rank_sort_operations.cuh(102): warning: dynamic initialization in unreachable code
2021-05-20T02:07:06.2942851Z detected during:
2021-05-20T02:07:06.2943662Z instantiation of "cub::BaseDigitExtractor<KeyT>::UnsignedBits cub::BaseDigitExtractor<KeyT>::ProcessFloatMinusZero(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:06.2944431Z (124): here
2021-05-20T02:07:06.2945083Z instantiation of "uint32_t cub::BFEDigitExtractor<KeyT>::Digit(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:06.2946297Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_rank.cuh(413): here
2021-05-20T02:07:06.2948116Z instantiation of "void cub::BlockRadixRank<BLOCK_DIM_X, RADIX_BITS, IS_DESCENDING, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(UnsignedBits (&)[KEYS_PER_THREAD], int (&)[KEYS_PER_THREAD], DigitExtractorT) [with BLOCK_DIM_X=128, RADIX_BITS=4, IS_DESCENDING=false, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=750, UnsignedBits=unsigned long long, KEYS_PER_THREAD=8, DigitExtractorT=cub::BFEDigitExtractor<uint_64>]"
2021-05-20T02:07:06.2950131Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(231): here
2021-05-20T02:07:06.2952449Z instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::UnsignedBits (&)[ITEMS_PER_THREAD], int (&)[ITEMS_PER_THREAD], cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::DigitExtractorT, cub::Int2Type<0>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=750]"
2021-05-20T02:07:06.2955336Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(367): here
2021-05-20T02:07:06.2957106Z instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], ValueT (&)[ITEMS_PER_THREAD], int, int, cub::Int2Type<DESCENDING>, cub::Int2Type<KEYS_ONLY>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=750, DESCENDING=0, KEYS_ONLY=1]"
2021-05-20T02:07:06.2959342Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(695): here
2021-05-20T02:07:06.2960919Z instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], int, int) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=750]"
2021-05-20T02:07:06.2962697Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(38): here
2021-05-20T02:07:06.2963492Z instantiation of "void BlockSortKernel<Key,BLOCK_THREADS,ITEMS_PER_THREAD>(Key *, Key *) [with Key=uint_64, BLOCK_THREADS=128, ITEMS_PER_THREAD=8]"
2021-05-20T02:07:06.2964515Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(221): here
2021-05-20T02:07:06.2965382Z instantiation of "void format_nbor_list_1024(uint_64 *, const FPTYPE *, const int *, const deepmd::InputNlist &, const int &, const float &, int *) [with FPTYPE=float]"
2021-05-20T02:07:06.2966632Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(463): here
2021-05-20T02:07:06.2967639Z instantiation of "void deepmd::format_nbor_list_gpu_cuda(int *, const FPTYPE *, const int *, const deepmd::InputNlist &, int *, uint_64 *, int, int, int, float, std::vector<int, std::allocator<int>>) [with FPTYPE=float]"
2021-05-20T02:07:06.2968891Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(506): here
2021-05-20T02:07:06.2969289Z
The same warning, with an otherwise identical instantiation trace, then repeats for PTX_ARCH=300 (at 2021-05-20T02:07:08.0130612Z, here with MEMOIZE_OUTER_SCAN=false), PTX_ARCH=350 (2021-05-20T02:07:08.6276192Z), PTX_ARCH=370 (2021-05-20T02:07:09.9175172Z), PTX_ARCH=500 (2021-05-20T02:07:11.2104995Z), PTX_ARCH=520 (2021-05-20T02:07:12.5147432Z), and PTX_ARCH=530 (2021-05-20T02:07:13.8594159Z). A further repetition follows:
2021-05-20T02:07:15.1917801Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/../block/radix_rank_sort_operations.cuh(102): warning: dynamic initialization in unreachable code
2021-05-20T02:07:15.1925206Z detected during:
2021-05-20T02:07:15.1931789Z instantiation of "cub::BaseDigitExtractor<KeyT>::UnsignedBits cub::BaseDigitExtractor<KeyT>::ProcessFloatMinusZero(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:15.1938059Z (124): here
2021-05-20T02:07:15.1944662Z instantiation of "uint32_t cub::BFEDigitExtractor<KeyT>::Digit(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:15.1951403Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_rank.cuh(413): here
2021-05-20T02:07:15.1958935Z instantiation of "void cub::BlockRadixRank<BLOCK_DIM_X, RADIX_BITS, IS_DESCENDING, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(UnsignedBits (&)[KEYS_PER_THREAD], int (&)[KEYS_PER_THREAD], DigitExtractorT) [with BLOCK_DIM_X=128, 
RADIX_BITS=4, IS_DESCENDING=false, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=600, UnsignedBits=unsigned long long, KEYS_PER_THREAD=8, DigitExtractorT=cub::BFEDigitExtractor<uint_64>]" 2021-05-20T02:07:15.1966669Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(231): here 2021-05-20T02:07:15.1974821Z instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::UnsignedBits (&)[ITEMS_PER_THREAD], int (&)[ITEMS_PER_THREAD], cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::DigitExtractorT, cub::Int2Type<0>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=600]" 2021-05-20T02:07:15.1983258Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(367): here 2021-05-20T02:07:15.2008395Z instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], ValueT (&)[ITEMS_PER_THREAD], int, int, cub::Int2Type<DESCENDING>, cub::Int2Type<KEYS_ONLY>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, 
SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=600, DESCENDING=0, KEYS_ONLY=1]" 2021-05-20T02:07:15.2010617Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(695): here 2021-05-20T02:07:15.2011979Z instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], int, int) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=600]" 2021-05-20T02:07:15.2013537Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(38): here 2021-05-20T02:07:15.2014151Z instantiation of "void BlockSortKernel<Key,BLOCK_THREADS,ITEMS_PER_THREAD>(Key *, Key *) [with Key=uint_64, BLOCK_THREADS=128, ITEMS_PER_THREAD=8]" 2021-05-20T02:07:15.2015111Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(221): here 2021-05-20T02:07:15.2015803Z instantiation of "void format_nbor_list_1024(uint_64 *, const FPTYPE *, const int *, const deepmd::InputNlist &, const int &, const float &, int *) [with FPTYPE=float]" 2021-05-20T02:07:15.2016694Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(463): here 2021-05-20T02:07:15.2017473Z instantiation of "void deepmd::format_nbor_list_gpu_cuda(int *, const FPTYPE *, const int *, const deepmd::InputNlist &, int *, uint_64 *, int, int, int, float, std::vector<int, std::allocator<int>>) [with FPTYPE=float]" 2021-05-20T02:07:15.2018477Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(506): here 2021-05-20T02:07:15.2018704Z 2021-05-20T02:07:16.5418795Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/../block/radix_rank_sort_operations.cuh(102): warning: dynamic 
initialization in unreachable code 2021-05-20T02:07:16.5425555Z detected during: 2021-05-20T02:07:16.5432062Z instantiation of "cub::BaseDigitExtractor<KeyT>::UnsignedBits cub::BaseDigitExtractor<KeyT>::ProcessFloatMinusZero(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]" 2021-05-20T02:07:16.5438263Z (124): here 2021-05-20T02:07:16.5444377Z instantiation of "uint32_t cub::BFEDigitExtractor<KeyT>::Digit(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]" 2021-05-20T02:07:16.5451005Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_rank.cuh(413): here 2021-05-20T02:07:16.5458620Z instantiation of "void cub::BlockRadixRank<BLOCK_DIM_X, RADIX_BITS, IS_DESCENDING, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(UnsignedBits (&)[KEYS_PER_THREAD], int (&)[KEYS_PER_THREAD], DigitExtractorT) [with BLOCK_DIM_X=128, RADIX_BITS=4, IS_DESCENDING=false, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=610, UnsignedBits=unsigned long long, KEYS_PER_THREAD=8, DigitExtractorT=cub::BFEDigitExtractor<uint_64>]" 2021-05-20T02:07:16.5467070Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(231): here 2021-05-20T02:07:16.5475098Z instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::UnsignedBits (&)[ITEMS_PER_THREAD], int (&)[ITEMS_PER_THREAD], cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::DigitExtractorT, 
cub::Int2Type<0>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=610]" 2021-05-20T02:07:16.5484541Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(367): here 2021-05-20T02:07:16.5491950Z instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], ValueT (&)[ITEMS_PER_THREAD], int, int, cub::Int2Type<DESCENDING>, cub::Int2Type<KEYS_ONLY>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=610, DESCENDING=0, KEYS_ONLY=1]" 2021-05-20T02:07:16.5500875Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(695): here 2021-05-20T02:07:16.5537829Z instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], int, int) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=610]" 2021-05-20T02:07:16.5539734Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(38): here 2021-05-20T02:07:16.5540368Z instantiation of "void BlockSortKernel<Key,BLOCK_THREADS,ITEMS_PER_THREAD>(Key *, Key *) [with Key=uint_64, BLOCK_THREADS=128, ITEMS_PER_THREAD=8]" 
2021-05-20T02:07:16.5541208Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(221): here 2021-05-20T02:07:16.5541892Z instantiation of "void format_nbor_list_1024(uint_64 *, const FPTYPE *, const int *, const deepmd::InputNlist &, const int &, const float &, int *) [with FPTYPE=float]" 2021-05-20T02:07:16.5542773Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(463): here 2021-05-20T02:07:16.5543566Z instantiation of "void deepmd::format_nbor_list_gpu_cuda(int *, const FPTYPE *, const int *, const deepmd::InputNlist &, int *, uint_64 *, int, int, int, float, std::vector<int, std::allocator<int>>) [with FPTYPE=float]" 2021-05-20T02:07:16.5544571Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(506): here 2021-05-20T02:07:16.5544800Z 2021-05-20T02:07:17.8837000Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/../block/radix_rank_sort_operations.cuh(102): warning: dynamic initialization in unreachable code 2021-05-20T02:07:17.8837789Z detected during: 2021-05-20T02:07:17.8838506Z instantiation of "cub::BaseDigitExtractor<KeyT>::UnsignedBits cub::BaseDigitExtractor<KeyT>::ProcessFloatMinusZero(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]" 2021-05-20T02:07:17.8839108Z (124): here 2021-05-20T02:07:17.8839588Z instantiation of "uint32_t cub::BFEDigitExtractor<KeyT>::Digit(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]" 2021-05-20T02:07:17.8840529Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_rank.cuh(413): here 2021-05-20T02:07:17.8842069Z instantiation of "void cub::BlockRadixRank<BLOCK_DIM_X, RADIX_BITS, IS_DESCENDING, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(UnsignedBits (&)[KEYS_PER_THREAD], int (&)[KEYS_PER_THREAD], DigitExtractorT) [with BLOCK_DIM_X=128, RADIX_BITS=4, IS_DESCENDING=false, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, 
SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=700, UnsignedBits=unsigned long long, KEYS_PER_THREAD=8, DigitExtractorT=cub::BFEDigitExtractor<uint_64>]" 2021-05-20T02:07:17.8844155Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(231): here 2021-05-20T02:07:17.8846291Z instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::UnsignedBits (&)[ITEMS_PER_THREAD], int (&)[ITEMS_PER_THREAD], cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::DigitExtractorT, cub::Int2Type<0>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=700]" 2021-05-20T02:07:17.8848832Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(367): here 2021-05-20T02:07:17.8850383Z instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], ValueT (&)[ITEMS_PER_THREAD], int, int, cub::Int2Type<DESCENDING>, cub::Int2Type<KEYS_ONLY>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=700, DESCENDING=0, 
KEYS_ONLY=1]" 2021-05-20T02:07:17.8852196Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(695): here 2021-05-20T02:07:17.8853558Z instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], int, int) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=700]" 2021-05-20T02:07:17.8855091Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(38): here 2021-05-20T02:07:17.8855664Z instantiation of "void BlockSortKernel<Key,BLOCK_THREADS,ITEMS_PER_THREAD>(Key *, Key *) [with Key=uint_64, BLOCK_THREADS=128, ITEMS_PER_THREAD=8]" 2021-05-20T02:07:17.8856481Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(221): here 2021-05-20T02:07:17.8857160Z instantiation of "void format_nbor_list_1024(uint_64 *, const FPTYPE *, const int *, const deepmd::InputNlist &, const int &, const float &, int *) [with FPTYPE=float]" 2021-05-20T02:07:17.8858025Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(463): here 2021-05-20T02:07:17.8858818Z instantiation of "void deepmd::format_nbor_list_gpu_cuda(int *, const FPTYPE *, const int *, const deepmd::InputNlist &, int *, uint_64 *, int, int, int, float, std::vector<int, std::allocator<int>>) [with FPTYPE=float]" 2021-05-20T02:07:17.8860049Z /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(506): here 2021-05-20T02:07:17.8860302Z
I am going to remove support for CUDA 9.2 and 10.0, as they don't support GCC 8 or later.
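For anyone who wants to guard a local build against the toolkits being dropped, here is a minimal sketch in Python. It is not the project's actual check: the helper names `parse_cuda_release` and `is_supported` are hypothetical, and the 10.1 minimum is an assumption taken from the report that the warning is absent in cudatoolkit 10.1 builds. It parses the `release X.Y` string that `nvcc --version` prints.

```python
import re

# Assumption: 10.1 is the oldest toolkit kept, because the warning
# is reported as absent in cudatoolkit 10.1 builds.
MIN_CUDA = (10, 1)


def parse_cuda_release(nvcc_version_output):
    """Return (major, minor) found in `nvcc --version` text, or None."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_version_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_supported(nvcc_version_output, minimum=MIN_CUDA):
    """True when the reported toolkit release is at least `minimum`."""
    release = parse_cuda_release(nvcc_version_output)
    return release is not None and release >= minimum


# Sample line in the format printed by `nvcc --version` for CUDA 10.0.
sample = "Cuda compilation tools, release 10.0, V10.0.130"
print(is_supported(sample))  # False: 10.0 is older than the assumed minimum
```

In a real build the text would come from running `nvcc --version`; comparing tuples keeps the ordering correct for releases such as 9.2 versus 10.1.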