deepmodeling / deepmd-kit

A deep learning package for many-body potential energy representation and molecular dynamics
https://docs.deepmodeling.com/projects/deepmd/
GNU Lesser General Public License v3.0

[BUG] cub related warning #647

Closed njzjz closed 3 years ago

njzjz commented 3 years ago

Summary

When compiling v2.0.0.b0 with cudatoolkit <= 10.0, several warnings related to cub appear. I am not sure whether it is safe to ignore them.

DeePMD-kit version, installation method, input files, commands run, error log, etc.

v2.0.0.b0, cudatoolkit 9.2/10.0

The warning doesn't appear in cudatoolkit 10.1 builds.

Steps to Reproduce

pip install -v .

Further Information, Files, and Links

See https://dev.azure.com/deepmd-kit-recipes/dfe7535b-bc98-4288-9a49-be8914966e7a/_apis/build/builds/410/logs/112
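
The log excerpt below repeats the same warning once per target architecture (the PTX_ARCH value in the template arguments), which makes it hard to read. As a rough reading aid, here is a minimal Python sketch that collapses such a log into one entry per warning site, together with the PTX_ARCH values seen for it. It is not part of DeePMD-kit. The function name `summarize_nvcc_warnings` and both regular expressions are assumptions based only on the line format visible in this log, and the sample text uses an abbreviated, made-up path.

```python
import re
from collections import defaultdict

# Matches a warning line of the form "<path>(<line>): warning: <message>".
# A leading CI timestamp, if present, is skipped because we use search().
WARNING_RE = re.compile(r"(?P<file>\S+)\((?P<line>\d+)\): warning: (?P<msg>.+?)\s*$")
# Picks the concrete architecture out of the "[with ..., PTX_ARCH=750, ...]" part.
ARCH_RE = re.compile(r"PTX_ARCH=(\d+)")


def summarize_nvcc_warnings(log_text):
    """Group warnings by (file name, line, message) and collect PTX_ARCH values."""
    summary = defaultdict(list)
    current = None  # warning site that the following trace lines belong to
    for raw in log_text.splitlines():
        match = WARNING_RE.search(raw)
        if match:
            # Keep only the base file name; build directories differ per run.
            name = match.group("file").rsplit("/", 1)[-1]
            current = (name, int(match.group("line")), match.group("msg"))
            summary[current]  # register the site even if no arch follows
            continue
        if current is not None:
            for arch in ARCH_RE.findall(raw):
                arch = int(arch)
                if arch not in summary[current]:
                    summary[current].append(arch)
    return dict(summary)


if __name__ == "__main__":
    # Abbreviated sample in the same shape as the log in this issue.
    sample = (
        "2021-05-20T02:07:06.2941810Z   /tmp/x/cub/block/radix_rank_sort_operations.cuh(102): "
        "warning: dynamic initialization in unreachable code\n"
        '2021-05-20T02:07:06.2948116Z       instantiation of "... [with BLOCK_DIM_X=128, PTX_ARCH=750]"\n'
        "2021-05-20T02:07:08.0130612Z   /tmp/x/cub/block/radix_rank_sort_operations.cuh(102): "
        "warning: dynamic initialization in unreachable code\n"
        '2021-05-20T02:07:08.0137357Z       instantiation of "... [with BLOCK_DIM_X=128, PTX_ARCH=300]"\n'
    )
    for site, archs in summarize_nvcc_warnings(sample).items():
        print(site, archs)
```

Run against the full build log, this should show whether every entry points at the same cub source line and differs only in the architecture being compiled, which is what the excerpt here suggests.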

2021-05-20T02:07:03.7111638Z   [  7%] Building NVCC (Device) object lib/src/cuda/CMakeFiles/deepmd_op_cuda.dir/deepmd_op_cuda_generated_prod_env_mat.cu.o
2021-05-20T02:07:06.2941810Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/../block/radix_rank_sort_operations.cuh(102): warning: dynamic initialization in unreachable code
2021-05-20T02:07:06.2942851Z             detected during:
2021-05-20T02:07:06.2943662Z               instantiation of "cub::BaseDigitExtractor<KeyT>::UnsignedBits cub::BaseDigitExtractor<KeyT>::ProcessFloatMinusZero(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:06.2944431Z   (124): here
2021-05-20T02:07:06.2945083Z               instantiation of "uint32_t cub::BFEDigitExtractor<KeyT>::Digit(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:06.2946297Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_rank.cuh(413): here
2021-05-20T02:07:06.2948116Z               instantiation of "void cub::BlockRadixRank<BLOCK_DIM_X, RADIX_BITS, IS_DESCENDING, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(UnsignedBits (&)[KEYS_PER_THREAD], int (&)[KEYS_PER_THREAD], DigitExtractorT) [with BLOCK_DIM_X=128, RADIX_BITS=4, IS_DESCENDING=false, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=750, UnsignedBits=unsigned long long, KEYS_PER_THREAD=8, DigitExtractorT=cub::BFEDigitExtractor<uint_64>]"
2021-05-20T02:07:06.2950131Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(231): here
2021-05-20T02:07:06.2952449Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::UnsignedBits (&)[ITEMS_PER_THREAD], int (&)[ITEMS_PER_THREAD], cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::DigitExtractorT, cub::Int2Type<0>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=750]"
2021-05-20T02:07:06.2955336Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(367): here
2021-05-20T02:07:06.2957106Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], ValueT (&)[ITEMS_PER_THREAD], int, int, cub::Int2Type<DESCENDING>, cub::Int2Type<KEYS_ONLY>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=750, DESCENDING=0, KEYS_ONLY=1]"
2021-05-20T02:07:06.2959342Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(695): here
2021-05-20T02:07:06.2960919Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], int, int) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=750]"
2021-05-20T02:07:06.2962697Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(38): here
2021-05-20T02:07:06.2963492Z               instantiation of "void BlockSortKernel<Key,BLOCK_THREADS,ITEMS_PER_THREAD>(Key *, Key *) [with Key=uint_64, BLOCK_THREADS=128, ITEMS_PER_THREAD=8]"
2021-05-20T02:07:06.2964515Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(221): here
2021-05-20T02:07:06.2965382Z               instantiation of "void format_nbor_list_1024(uint_64 *, const FPTYPE *, const int *, const deepmd::InputNlist &, const int &, const float &, int *) [with FPTYPE=float]"
2021-05-20T02:07:06.2966632Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(463): here
2021-05-20T02:07:06.2967639Z               instantiation of "void deepmd::format_nbor_list_gpu_cuda(int *, const FPTYPE *, const int *, const deepmd::InputNlist &, int *, uint_64 *, int, int, int, float, std::vector<int, std::allocator<int>>) [with FPTYPE=float]"
2021-05-20T02:07:06.2968891Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(506): here
2021-05-20T02:07:06.2969289Z 
2021-05-20T02:07:08.0130612Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/../block/radix_rank_sort_operations.cuh(102): warning: dynamic initialization in unreachable code
2021-05-20T02:07:08.0131580Z             detected during:
2021-05-20T02:07:08.0132580Z               instantiation of "cub::BaseDigitExtractor<KeyT>::UnsignedBits cub::BaseDigitExtractor<KeyT>::ProcessFloatMinusZero(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:08.0133399Z   (124): here
2021-05-20T02:07:08.0134078Z               instantiation of "uint32_t cub::BFEDigitExtractor<KeyT>::Digit(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:08.0135277Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_rank.cuh(413): here
2021-05-20T02:07:08.0137357Z               instantiation of "void cub::BlockRadixRank<BLOCK_DIM_X, RADIX_BITS, IS_DESCENDING, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(UnsignedBits (&)[KEYS_PER_THREAD], int (&)[KEYS_PER_THREAD], DigitExtractorT) [with BLOCK_DIM_X=128, RADIX_BITS=4, IS_DESCENDING=false, MEMOIZE_OUTER_SCAN=false, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=300, UnsignedBits=unsigned long long, KEYS_PER_THREAD=8, DigitExtractorT=cub::BFEDigitExtractor<uint_64>]"
2021-05-20T02:07:08.0139851Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(231): here
2021-05-20T02:07:08.0142425Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::UnsignedBits (&)[ITEMS_PER_THREAD], int (&)[ITEMS_PER_THREAD], cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::DigitExtractorT, cub::Int2Type<0>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=false, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=300]"
2021-05-20T02:07:08.0145424Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(367): here
2021-05-20T02:07:08.0147298Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], ValueT (&)[ITEMS_PER_THREAD], int, int, cub::Int2Type<DESCENDING>, cub::Int2Type<KEYS_ONLY>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=false, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=300, DESCENDING=0, KEYS_ONLY=1]"
2021-05-20T02:07:08.0149508Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(695): here
2021-05-20T02:07:08.0151136Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], int, int) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=false, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=300]"
2021-05-20T02:07:08.0153728Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(38): here
2021-05-20T02:07:08.0154393Z               instantiation of "void BlockSortKernel<Key,BLOCK_THREADS,ITEMS_PER_THREAD>(Key *, Key *) [with Key=uint_64, BLOCK_THREADS=128, ITEMS_PER_THREAD=8]"
2021-05-20T02:07:08.0155261Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(221): here
2021-05-20T02:07:08.0155976Z               instantiation of "void format_nbor_list_1024(uint_64 *, const FPTYPE *, const int *, const deepmd::InputNlist &, const int &, const float &, int *) [with FPTYPE=float]"
2021-05-20T02:07:08.0156891Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(463): here
2021-05-20T02:07:08.0157729Z               instantiation of "void deepmd::format_nbor_list_gpu_cuda(int *, const FPTYPE *, const int *, const deepmd::InputNlist &, int *, uint_64 *, int, int, int, float, std::vector<int, std::allocator<int>>) [with FPTYPE=float]"
2021-05-20T02:07:08.0158975Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(506): here
2021-05-20T02:07:08.0159212Z 
2021-05-20T02:07:08.6276192Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/../block/radix_rank_sort_operations.cuh(102): warning: dynamic initialization in unreachable code
2021-05-20T02:07:08.6276846Z             detected during:
2021-05-20T02:07:08.6277479Z               instantiation of "cub::BaseDigitExtractor<KeyT>::UnsignedBits cub::BaseDigitExtractor<KeyT>::ProcessFloatMinusZero(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:08.6278082Z   (124): here
2021-05-20T02:07:08.6278565Z               instantiation of "uint32_t cub::BFEDigitExtractor<KeyT>::Digit(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:08.6279498Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_rank.cuh(413): here
2021-05-20T02:07:08.6281068Z               instantiation of "void cub::BlockRadixRank<BLOCK_DIM_X, RADIX_BITS, IS_DESCENDING, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(UnsignedBits (&)[KEYS_PER_THREAD], int (&)[KEYS_PER_THREAD], DigitExtractorT) [with BLOCK_DIM_X=128, RADIX_BITS=4, IS_DESCENDING=false, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=350, UnsignedBits=unsigned long long, KEYS_PER_THREAD=8, DigitExtractorT=cub::BFEDigitExtractor<uint_64>]"
2021-05-20T02:07:08.6283109Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(231): here
2021-05-20T02:07:08.6285242Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::UnsignedBits (&)[ITEMS_PER_THREAD], int (&)[ITEMS_PER_THREAD], cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::DigitExtractorT, cub::Int2Type<0>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=350]"
2021-05-20T02:07:08.6287633Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(367): here
2021-05-20T02:07:08.6289206Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], ValueT (&)[ITEMS_PER_THREAD], int, int, cub::Int2Type<DESCENDING>, cub::Int2Type<KEYS_ONLY>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=350, DESCENDING=0, KEYS_ONLY=1]"
2021-05-20T02:07:08.6291021Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(695): here
2021-05-20T02:07:08.6292367Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], int, int) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=350]"
2021-05-20T02:07:08.6294107Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(38): here
2021-05-20T02:07:08.6294711Z               instantiation of "void BlockSortKernel<Key,BLOCK_THREADS,ITEMS_PER_THREAD>(Key *, Key *) [with Key=uint_64, BLOCK_THREADS=128, ITEMS_PER_THREAD=8]"
2021-05-20T02:07:08.6295509Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(221): here
2021-05-20T02:07:08.6296192Z               instantiation of "void format_nbor_list_1024(uint_64 *, const FPTYPE *, const int *, const deepmd::InputNlist &, const int &, const float &, int *) [with FPTYPE=float]"
2021-05-20T02:07:08.6297066Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(463): here
2021-05-20T02:07:08.6297867Z               instantiation of "void deepmd::format_nbor_list_gpu_cuda(int *, const FPTYPE *, const int *, const deepmd::InputNlist &, int *, uint_64 *, int, int, int, float, std::vector<int, std::allocator<int>>) [with FPTYPE=float]"
2021-05-20T02:07:08.6298884Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(506): here
2021-05-20T02:07:08.6299211Z 
2021-05-20T02:07:09.9175172Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/../block/radix_rank_sort_operations.cuh(102): warning: dynamic initialization in unreachable code
2021-05-20T02:07:09.9176486Z             detected during:
2021-05-20T02:07:09.9177155Z               instantiation of "cub::BaseDigitExtractor<KeyT>::UnsignedBits cub::BaseDigitExtractor<KeyT>::ProcessFloatMinusZero(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:09.9177737Z   (124): here
2021-05-20T02:07:09.9183265Z               instantiation of "uint32_t cub::BFEDigitExtractor<KeyT>::Digit(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:09.9188317Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_rank.cuh(413): here
2021-05-20T02:07:09.9194777Z               instantiation of "void cub::BlockRadixRank<BLOCK_DIM_X, RADIX_BITS, IS_DESCENDING, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(UnsignedBits (&)[KEYS_PER_THREAD], int (&)[KEYS_PER_THREAD], DigitExtractorT) [with BLOCK_DIM_X=128, RADIX_BITS=4, IS_DESCENDING=false, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=370, UnsignedBits=unsigned long long, KEYS_PER_THREAD=8, DigitExtractorT=cub::BFEDigitExtractor<uint_64>]"
2021-05-20T02:07:09.9201386Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(231): here
2021-05-20T02:07:09.9207982Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::UnsignedBits (&)[ITEMS_PER_THREAD], int (&)[ITEMS_PER_THREAD], cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::DigitExtractorT, cub::Int2Type<0>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=370]"
2021-05-20T02:07:09.9214950Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(367): here
2021-05-20T02:07:09.9221079Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], ValueT (&)[ITEMS_PER_THREAD], int, int, cub::Int2Type<DESCENDING>, cub::Int2Type<KEYS_ONLY>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=370, DESCENDING=0, KEYS_ONLY=1]"
2021-05-20T02:07:09.9227770Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(695): here
2021-05-20T02:07:09.9233574Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], int, int) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=370]"
2021-05-20T02:07:09.9239673Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(38): here
2021-05-20T02:07:09.9244911Z               instantiation of "void BlockSortKernel<Key,BLOCK_THREADS,ITEMS_PER_THREAD>(Key *, Key *) [with Key=uint_64, BLOCK_THREADS=128, ITEMS_PER_THREAD=8]"
2021-05-20T02:07:09.9250504Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(221): here
2021-05-20T02:07:09.9255431Z               instantiation of "void format_nbor_list_1024(uint_64 *, const FPTYPE *, const int *, const deepmd::InputNlist &, const int &, const float &, int *) [with FPTYPE=float]"
2021-05-20T02:07:09.9261072Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(463): here
2021-05-20T02:07:09.9267711Z               instantiation of "void deepmd::format_nbor_list_gpu_cuda(int *, const FPTYPE *, const int *, const deepmd::InputNlist &, int *, uint_64 *, int, int, int, float, std::vector<int, std::allocator<int>>) [with FPTYPE=float]"
2021-05-20T02:07:09.9274091Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(506): here
2021-05-20T02:07:09.9279266Z 
2021-05-20T02:07:11.2104995Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/../block/radix_rank_sort_operations.cuh(102): warning: dynamic initialization in unreachable code
2021-05-20T02:07:11.2111336Z             detected during:
2021-05-20T02:07:11.2117617Z               instantiation of "cub::BaseDigitExtractor<KeyT>::UnsignedBits cub::BaseDigitExtractor<KeyT>::ProcessFloatMinusZero(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:11.2124027Z   (124): here
2021-05-20T02:07:11.2130465Z               instantiation of "uint32_t cub::BFEDigitExtractor<KeyT>::Digit(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:11.2137448Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_rank.cuh(413): here
2021-05-20T02:07:11.2145060Z               instantiation of "void cub::BlockRadixRank<BLOCK_DIM_X, RADIX_BITS, IS_DESCENDING, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(UnsignedBits (&)[KEYS_PER_THREAD], int (&)[KEYS_PER_THREAD], DigitExtractorT) [with BLOCK_DIM_X=128, RADIX_BITS=4, IS_DESCENDING=false, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=500, UnsignedBits=unsigned long long, KEYS_PER_THREAD=8, DigitExtractorT=cub::BFEDigitExtractor<uint_64>]"
2021-05-20T02:07:11.2153089Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(231): here
2021-05-20T02:07:11.2167268Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::UnsignedBits (&)[ITEMS_PER_THREAD], int (&)[ITEMS_PER_THREAD], cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::DigitExtractorT, cub::Int2Type<0>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=500]"
2021-05-20T02:07:11.2175981Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(367): here
2021-05-20T02:07:11.2187231Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], ValueT (&)[ITEMS_PER_THREAD], int, int, cub::Int2Type<DESCENDING>, cub::Int2Type<KEYS_ONLY>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=500, DESCENDING=0, KEYS_ONLY=1]"
2021-05-20T02:07:11.2189433Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(695): here
2021-05-20T02:07:11.2203198Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], int, int) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=500]"
2021-05-20T02:07:11.2205365Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(38): here
2021-05-20T02:07:11.2206193Z               instantiation of "void BlockSortKernel<Key,BLOCK_THREADS,ITEMS_PER_THREAD>(Key *, Key *) [with Key=uint_64, BLOCK_THREADS=128, ITEMS_PER_THREAD=8]"
2021-05-20T02:07:11.2207237Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(221): here
2021-05-20T02:07:11.2208120Z               instantiation of "void format_nbor_list_1024(uint_64 *, const FPTYPE *, const int *, const deepmd::InputNlist &, const int &, const float &, int *) [with FPTYPE=float]"
2021-05-20T02:07:11.2209198Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(463): here
2021-05-20T02:07:11.2210189Z               instantiation of "void deepmd::format_nbor_list_gpu_cuda(int *, const FPTYPE *, const int *, const deepmd::InputNlist &, int *, uint_64 *, int, int, int, float, std::vector<int, std::allocator<int>>) [with FPTYPE=float]"
2021-05-20T02:07:11.2211379Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(506): here
2021-05-20T02:07:11.2211819Z 
2021-05-20T02:07:12.5147432Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/../block/radix_rank_sort_operations.cuh(102): warning: dynamic initialization in unreachable code
2021-05-20T02:07:12.5153880Z             detected during:
2021-05-20T02:07:12.5160287Z               instantiation of "cub::BaseDigitExtractor<KeyT>::UnsignedBits cub::BaseDigitExtractor<KeyT>::ProcessFloatMinusZero(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:12.5166738Z   (124): here
2021-05-20T02:07:12.5172865Z               instantiation of "uint32_t cub::BFEDigitExtractor<KeyT>::Digit(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:12.5179945Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_rank.cuh(413): here
2021-05-20T02:07:12.5187470Z               instantiation of "void cub::BlockRadixRank<BLOCK_DIM_X, RADIX_BITS, IS_DESCENDING, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(UnsignedBits (&)[KEYS_PER_THREAD], int (&)[KEYS_PER_THREAD], DigitExtractorT) [with BLOCK_DIM_X=128, RADIX_BITS=4, IS_DESCENDING=false, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=520, UnsignedBits=unsigned long long, KEYS_PER_THREAD=8, DigitExtractorT=cub::BFEDigitExtractor<uint_64>]"
2021-05-20T02:07:12.5195344Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(231): here
2021-05-20T02:07:12.5203390Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::UnsignedBits (&)[ITEMS_PER_THREAD], int (&)[ITEMS_PER_THREAD], cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::DigitExtractorT, cub::Int2Type<0>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=520]"
2021-05-20T02:07:12.5220139Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(367): here
2021-05-20T02:07:12.5221782Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], ValueT (&)[ITEMS_PER_THREAD], int, int, cub::Int2Type<DESCENDING>, cub::Int2Type<KEYS_ONLY>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=520, DESCENDING=0, KEYS_ONLY=1]"
2021-05-20T02:07:12.5223633Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(695): here
2021-05-20T02:07:12.5224985Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], int, int) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=520]"
2021-05-20T02:07:12.5226540Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(38): here
2021-05-20T02:07:12.5227148Z               instantiation of "void BlockSortKernel<Key,BLOCK_THREADS,ITEMS_PER_THREAD>(Key *, Key *) [with Key=uint_64, BLOCK_THREADS=128, ITEMS_PER_THREAD=8]"
2021-05-20T02:07:12.5227940Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(221): here
2021-05-20T02:07:12.5228623Z               instantiation of "void format_nbor_list_1024(uint_64 *, const FPTYPE *, const int *, const deepmd::InputNlist &, const int &, const float &, int *) [with FPTYPE=float]"
2021-05-20T02:07:12.5229492Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(463): here
2021-05-20T02:07:12.5230877Z               instantiation of "void deepmd::format_nbor_list_gpu_cuda(int *, const FPTYPE *, const int *, const deepmd::InputNlist &, int *, uint_64 *, int, int, int, float, std::vector<int, std::allocator<int>>) [with FPTYPE=float]"
2021-05-20T02:07:12.5231988Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(506): here
2021-05-20T02:07:12.5232386Z 
2021-05-20T02:07:13.8594159Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/../block/radix_rank_sort_operations.cuh(102): warning: dynamic initialization in unreachable code
2021-05-20T02:07:13.8600704Z             detected during:
2021-05-20T02:07:13.8611185Z               instantiation of "cub::BaseDigitExtractor<KeyT>::UnsignedBits cub::BaseDigitExtractor<KeyT>::ProcessFloatMinusZero(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:13.8622873Z   (124): here
2021-05-20T02:07:13.8629043Z               instantiation of "uint32_t cub::BFEDigitExtractor<KeyT>::Digit(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:13.8635987Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_rank.cuh(413): here
2021-05-20T02:07:13.8643388Z               instantiation of "void cub::BlockRadixRank<BLOCK_DIM_X, RADIX_BITS, IS_DESCENDING, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(UnsignedBits (&)[KEYS_PER_THREAD], int (&)[KEYS_PER_THREAD], DigitExtractorT) [with BLOCK_DIM_X=128, RADIX_BITS=4, IS_DESCENDING=false, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=530, UnsignedBits=unsigned long long, KEYS_PER_THREAD=8, DigitExtractorT=cub::BFEDigitExtractor<uint_64>]"
2021-05-20T02:07:13.8651403Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(231): here
2021-05-20T02:07:13.8659583Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::UnsignedBits (&)[ITEMS_PER_THREAD], int (&)[ITEMS_PER_THREAD], cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::DigitExtractorT, cub::Int2Type<0>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=530]"
2021-05-20T02:07:13.8667955Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(367): here
2021-05-20T02:07:13.8675386Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], ValueT (&)[ITEMS_PER_THREAD], int, int, cub::Int2Type<DESCENDING>, cub::Int2Type<KEYS_ONLY>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=530, DESCENDING=0, KEYS_ONLY=1]"
2021-05-20T02:07:13.8701063Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(695): here
2021-05-20T02:07:13.8702493Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], int, int) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=530]"
2021-05-20T02:07:13.8704068Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(38): here
2021-05-20T02:07:13.8704854Z               instantiation of "void BlockSortKernel<Key,BLOCK_THREADS,ITEMS_PER_THREAD>(Key *, Key *) [with Key=uint_64, BLOCK_THREADS=128, ITEMS_PER_THREAD=8]"
2021-05-20T02:07:13.8705723Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(221): here
2021-05-20T02:07:13.8706392Z               instantiation of "void format_nbor_list_1024(uint_64 *, const FPTYPE *, const int *, const deepmd::InputNlist &, const int &, const float &, int *) [with FPTYPE=float]"
2021-05-20T02:07:13.8716542Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(463): here
2021-05-20T02:07:13.8717368Z               instantiation of "void deepmd::format_nbor_list_gpu_cuda(int *, const FPTYPE *, const int *, const deepmd::InputNlist &, int *, uint_64 *, int, int, int, float, std::vector<int, std::allocator<int>>) [with FPTYPE=float]"
2021-05-20T02:07:13.8718424Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(506): here
2021-05-20T02:07:13.8718667Z 
2021-05-20T02:07:15.1917801Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/../block/radix_rank_sort_operations.cuh(102): warning: dynamic initialization in unreachable code
2021-05-20T02:07:15.1925206Z             detected during:
2021-05-20T02:07:15.1931789Z               instantiation of "cub::BaseDigitExtractor<KeyT>::UnsignedBits cub::BaseDigitExtractor<KeyT>::ProcessFloatMinusZero(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:15.1938059Z   (124): here
2021-05-20T02:07:15.1944662Z               instantiation of "uint32_t cub::BFEDigitExtractor<KeyT>::Digit(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:15.1951403Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_rank.cuh(413): here
2021-05-20T02:07:15.1958935Z               instantiation of "void cub::BlockRadixRank<BLOCK_DIM_X, RADIX_BITS, IS_DESCENDING, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(UnsignedBits (&)[KEYS_PER_THREAD], int (&)[KEYS_PER_THREAD], DigitExtractorT) [with BLOCK_DIM_X=128, RADIX_BITS=4, IS_DESCENDING=false, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=600, UnsignedBits=unsigned long long, KEYS_PER_THREAD=8, DigitExtractorT=cub::BFEDigitExtractor<uint_64>]"
2021-05-20T02:07:15.1966669Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(231): here
2021-05-20T02:07:15.1974821Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::UnsignedBits (&)[ITEMS_PER_THREAD], int (&)[ITEMS_PER_THREAD], cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::DigitExtractorT, cub::Int2Type<0>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=600]"
2021-05-20T02:07:15.1983258Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(367): here
2021-05-20T02:07:15.2008395Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], ValueT (&)[ITEMS_PER_THREAD], int, int, cub::Int2Type<DESCENDING>, cub::Int2Type<KEYS_ONLY>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=600, DESCENDING=0, KEYS_ONLY=1]"
2021-05-20T02:07:15.2010617Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(695): here
2021-05-20T02:07:15.2011979Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], int, int) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=600]"
2021-05-20T02:07:15.2013537Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(38): here
2021-05-20T02:07:15.2014151Z               instantiation of "void BlockSortKernel<Key,BLOCK_THREADS,ITEMS_PER_THREAD>(Key *, Key *) [with Key=uint_64, BLOCK_THREADS=128, ITEMS_PER_THREAD=8]"
2021-05-20T02:07:15.2015111Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(221): here
2021-05-20T02:07:15.2015803Z               instantiation of "void format_nbor_list_1024(uint_64 *, const FPTYPE *, const int *, const deepmd::InputNlist &, const int &, const float &, int *) [with FPTYPE=float]"
2021-05-20T02:07:15.2016694Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(463): here
2021-05-20T02:07:15.2017473Z               instantiation of "void deepmd::format_nbor_list_gpu_cuda(int *, const FPTYPE *, const int *, const deepmd::InputNlist &, int *, uint_64 *, int, int, int, float, std::vector<int, std::allocator<int>>) [with FPTYPE=float]"
2021-05-20T02:07:15.2018477Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(506): here
2021-05-20T02:07:15.2018704Z 
2021-05-20T02:07:16.5418795Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/../block/radix_rank_sort_operations.cuh(102): warning: dynamic initialization in unreachable code
2021-05-20T02:07:16.5425555Z             detected during:
2021-05-20T02:07:16.5432062Z               instantiation of "cub::BaseDigitExtractor<KeyT>::UnsignedBits cub::BaseDigitExtractor<KeyT>::ProcessFloatMinusZero(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:16.5438263Z   (124): here
2021-05-20T02:07:16.5444377Z               instantiation of "uint32_t cub::BFEDigitExtractor<KeyT>::Digit(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:16.5451005Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_rank.cuh(413): here
2021-05-20T02:07:16.5458620Z               instantiation of "void cub::BlockRadixRank<BLOCK_DIM_X, RADIX_BITS, IS_DESCENDING, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(UnsignedBits (&)[KEYS_PER_THREAD], int (&)[KEYS_PER_THREAD], DigitExtractorT) [with BLOCK_DIM_X=128, RADIX_BITS=4, IS_DESCENDING=false, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=610, UnsignedBits=unsigned long long, KEYS_PER_THREAD=8, DigitExtractorT=cub::BFEDigitExtractor<uint_64>]"
2021-05-20T02:07:16.5467070Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(231): here
2021-05-20T02:07:16.5475098Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::UnsignedBits (&)[ITEMS_PER_THREAD], int (&)[ITEMS_PER_THREAD], cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::DigitExtractorT, cub::Int2Type<0>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=610]"
2021-05-20T02:07:16.5484541Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(367): here
2021-05-20T02:07:16.5491950Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], ValueT (&)[ITEMS_PER_THREAD], int, int, cub::Int2Type<DESCENDING>, cub::Int2Type<KEYS_ONLY>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=610, DESCENDING=0, KEYS_ONLY=1]"
2021-05-20T02:07:16.5500875Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(695): here
2021-05-20T02:07:16.5537829Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], int, int) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=610]"
2021-05-20T02:07:16.5539734Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(38): here
2021-05-20T02:07:16.5540368Z               instantiation of "void BlockSortKernel<Key,BLOCK_THREADS,ITEMS_PER_THREAD>(Key *, Key *) [with Key=uint_64, BLOCK_THREADS=128, ITEMS_PER_THREAD=8]"
2021-05-20T02:07:16.5541208Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(221): here
2021-05-20T02:07:16.5541892Z               instantiation of "void format_nbor_list_1024(uint_64 *, const FPTYPE *, const int *, const deepmd::InputNlist &, const int &, const float &, int *) [with FPTYPE=float]"
2021-05-20T02:07:16.5542773Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(463): here
2021-05-20T02:07:16.5543566Z               instantiation of "void deepmd::format_nbor_list_gpu_cuda(int *, const FPTYPE *, const int *, const deepmd::InputNlist &, int *, uint_64 *, int, int, int, float, std::vector<int, std::allocator<int>>) [with FPTYPE=float]"
2021-05-20T02:07:16.5544571Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(506): here
2021-05-20T02:07:16.5544800Z 
2021-05-20T02:07:17.8837000Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/../block/radix_rank_sort_operations.cuh(102): warning: dynamic initialization in unreachable code
2021-05-20T02:07:17.8837789Z             detected during:
2021-05-20T02:07:17.8838506Z               instantiation of "cub::BaseDigitExtractor<KeyT>::UnsignedBits cub::BaseDigitExtractor<KeyT>::ProcessFloatMinusZero(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:17.8839108Z   (124): here
2021-05-20T02:07:17.8839588Z               instantiation of "uint32_t cub::BFEDigitExtractor<KeyT>::Digit(cub::BaseDigitExtractor<KeyT>::UnsignedBits) [with KeyT=uint_64]"
2021-05-20T02:07:17.8840529Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_rank.cuh(413): here
2021-05-20T02:07:17.8842069Z               instantiation of "void cub::BlockRadixRank<BLOCK_DIM_X, RADIX_BITS, IS_DESCENDING, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(UnsignedBits (&)[KEYS_PER_THREAD], int (&)[KEYS_PER_THREAD], DigitExtractorT) [with BLOCK_DIM_X=128, RADIX_BITS=4, IS_DESCENDING=false, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=700, UnsignedBits=unsigned long long, KEYS_PER_THREAD=8, DigitExtractorT=cub::BFEDigitExtractor<uint_64>]"
2021-05-20T02:07:17.8844155Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(231): here
2021-05-20T02:07:17.8846291Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::RankKeys(cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::UnsignedBits (&)[ITEMS_PER_THREAD], int (&)[ITEMS_PER_THREAD], cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::DigitExtractorT, cub::Int2Type<0>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=700]"
2021-05-20T02:07:17.8848832Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(367): here
2021-05-20T02:07:17.8850383Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], ValueT (&)[ITEMS_PER_THREAD], int, int, cub::Int2Type<DESCENDING>, cub::Int2Type<KEYS_ONLY>) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=700, DESCENDING=0, KEYS_ONLY=1]"
2021-05-20T02:07:17.8852196Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/cub/cub/block/block_radix_sort.cuh(695): here
2021-05-20T02:07:17.8853558Z               instantiation of "void cub::BlockRadixSort<KeyT, BLOCK_DIM_X, ITEMS_PER_THREAD, ValueT, RADIX_BITS, MEMOIZE_OUTER_SCAN, INNER_SCAN_ALGORITHM, SMEM_CONFIG, BLOCK_DIM_Y, BLOCK_DIM_Z, PTX_ARCH>::SortBlockedToStriped(KeyT (&)[ITEMS_PER_THREAD], int, int) [with KeyT=uint_64, BLOCK_DIM_X=128, ITEMS_PER_THREAD=8, ValueT=cub::NullType, RADIX_BITS=4, MEMOIZE_OUTER_SCAN=true, INNER_SCAN_ALGORITHM=cub::BLOCK_SCAN_WARP_SCANS, SMEM_CONFIG=cudaSharedMemBankSizeFourByte, BLOCK_DIM_Y=1, BLOCK_DIM_Z=1, PTX_ARCH=700]"
2021-05-20T02:07:17.8855091Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(38): here
2021-05-20T02:07:17.8855664Z               instantiation of "void BlockSortKernel<Key,BLOCK_THREADS,ITEMS_PER_THREAD>(Key *, Key *) [with Key=uint_64, BLOCK_THREADS=128, ITEMS_PER_THREAD=8]"
2021-05-20T02:07:17.8856481Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(221): here
2021-05-20T02:07:17.8857160Z               instantiation of "void format_nbor_list_1024(uint_64 *, const FPTYPE *, const int *, const deepmd::InputNlist &, const int &, const float &, int *) [with FPTYPE=float]"
2021-05-20T02:07:17.8858025Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(463): here
2021-05-20T02:07:17.8858818Z               instantiation of "void deepmd::format_nbor_list_gpu_cuda(int *, const FPTYPE *, const int *, const deepmd::InputNlist &, int *, uint_64 *, int, int, int, float, std::vector<int, std::allocator<int>>) [with FPTYPE=float]"
2021-05-20T02:07:17.8860049Z   /tmp/pip-req-build-1c7d41rt/source/lib/src/cuda/prod_env_mat.cu(506): here
2021-05-20T02:07:17.8860302Z 
njzjz commented 3 years ago

I am going to drop support for CUDA 9.2 and 10.0, since neither supports GCC 8 or later.
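
For illustration only, a version floor like this could be expressed as a compile-time check. The sketch below is plain C++ and is not taken from the DeePMD-kit source: the names `encode_cuda_version`, `kMinCudaVersion`, and `cuda_toolkit_supported` are hypothetical, and the 10.1 floor is an assumption based on the observation above that the warning does not appear in cudatoolkit 10.1 builds. The encoding follows the `CUDART_VERSION` convention (major * 1000 + minor * 10).

```cpp
// Hypothetical helper (not part of DeePMD-kit): encode a toolkit version
// the same way the CUDART_VERSION macro does, e.g. 10.1 -> 10010.
constexpr int encode_cuda_version(int major, int minor) {
    return major * 1000 + minor * 10;
}

// Assumed floor: CUDA 10.1, the oldest toolkit reported in this thread
// to build without the cub warning.
constexpr int kMinCudaVersion = encode_cuda_version(10, 1);

// True when the given toolkit version is at or above the assumed floor.
constexpr bool cuda_toolkit_supported(int major, int minor) {
    return encode_cuda_version(major, minor) >= kMinCudaVersion;
}

// The two toolkits being dropped fall below the floor; 10.1 passes.
static_assert(!cuda_toolkit_supported(9, 2), "CUDA 9.2 is below the floor");
static_assert(!cuda_toolkit_supported(10, 0), "CUDA 10.0 is below the floor");
static_assert(cuda_toolkit_supported(10, 1), "CUDA 10.1 meets the floor");
```

In a real `.cu` file the same comparison could be made against `CUDART_VERSION` in an `#if` guard with `#error`. Whether the project enforces the floor in source or in its build configuration is not stated here.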