NVIDIA-Merlin / HugeCTR

HugeCTR is a high-efficiency GPU framework designed for Click-Through-Rate (CTR) estimation training
Apache License 2.0

[BUG] cooperative_groups/scan.h not in cuda11.X #416

Open MichoChan opened 10 months ago

MichoChan commented 10 months ago

How can I fix this?

MichoChan commented 10 months ago

Should I use CUB instead?

KingsleyLiu-NV commented 10 months ago

Hi @MichoChan, can you tell us which CUDA version is missing the header file? If you use an NGC container, e.g., nvcr.io/nvidia/pytorch:22.12-py3 with CUDA 11.8, the problem does not occur:

/usr/local/cuda/include/cooperative_groups# ls -l
total 16
drwxr-xr-x 2 root root 4096 Dec 14  2022 details
-rw-r--r-- 1 root root 2960 Sep 21  2022 memcpy_async.h
-rw-r--r-- 1 root root 2949 Sep 21  2022 reduce.h
-rw-r--r-- 1 root root 2940 Sep 21  2022 scan.h

/usr/local/cuda/include/cooperative_groups# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
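
To check at compile time whether a given toolkit ships the header, one option is a preprocessor guard like the sketch below. It assumes the host compiler supports `__has_include` (standard since C++17), and the macro name `EXAMPLE_HAS_CG_SCAN` is hypothetical, not something HugeCTR defines.

```cuda
// Sketch: detect <cooperative_groups/scan.h> without hard-coding a CUDA version.
// EXAMPLE_HAS_CG_SCAN is a hypothetical macro name used only in this example.
#if defined(__has_include)
#if __has_include(<cooperative_groups/scan.h>)
#include <cooperative_groups.h>
#include <cooperative_groups/scan.h>
#define EXAMPLE_HAS_CG_SCAN 1
#endif
#endif

#ifndef EXAMPLE_HAS_CG_SCAN
// Header not found (or __has_include unavailable): use another scan path.
#define EXAMPLE_HAS_CG_SCAN 0
#endif

// Lets host code branch on the result of the check above.
constexpr bool example_has_cg_scan() { return EXAMPLE_HAS_CG_SCAN != 0; }
```

Code that needs a scan can then branch on `EXAMPLE_HAS_CG_SCAN` and use a different implementation when the header is absent.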

MichoChan commented 10 months ago

I'm using CUDA 11.2 in my environment.

KingsleyLiu-NV commented 10 months ago

Hi @MichoChan, if you have to stay on CUDA 11.2, you can use CUB or Thrust instead. Both are available in that toolkit's include directory:

/usr/local/cuda/include# find -name "*scan*h"
./thrust/scan.h
...
./cub/agent/agent_scan.cuh
...

/usr/local/cuda/include# nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0
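
As a rough illustration of the CUB route, the sketch below computes a block-wide exclusive prefix sum with `cub::BlockScan`. It is not HugeCTR's code and not a drop-in replacement for the cooperative groups scan; the kernel name, the helper `exclusive_sum_one_block`, and the block size of 128 are assumptions made for this example.

```cuda
#include <cub/block/block_scan.cuh>
#include <cuda_runtime.h>
#include <vector>

constexpr int kBlockThreads = 128;  // hypothetical block size for this sketch

// Each thread contributes one value; cub::BlockScan computes the block-wide
// exclusive prefix sum, so thread i receives the sum of values 0..i-1.
__global__ void block_exclusive_sum(const int* in, int* out) {
  using BlockScan = cub::BlockScan<int, kBlockThreads>;
  __shared__ typename BlockScan::TempStorage temp_storage;

  int prefix = 0;
  BlockScan(temp_storage).ExclusiveSum(in[threadIdx.x], prefix);
  out[threadIdx.x] = prefix;
}

// Hypothetical host helper: scans exactly kBlockThreads values in one block.
// Returns an empty vector on a size mismatch or if any CUDA call fails.
std::vector<int> exclusive_sum_one_block(const std::vector<int>& in) {
  if (static_cast<int>(in.size()) != kBlockThreads) return {};

  const size_t bytes = kBlockThreads * sizeof(int);
  std::vector<int> out(kBlockThreads);
  int* d_in = nullptr;
  int* d_out = nullptr;

  bool ok = cudaMalloc(&d_in, bytes) == cudaSuccess &&
            cudaMalloc(&d_out, bytes) == cudaSuccess &&
            cudaMemcpy(d_in, in.data(), bytes, cudaMemcpyHostToDevice) == cudaSuccess;
  if (ok) {
    block_exclusive_sum<<<1, kBlockThreads>>>(d_in, d_out);
    // The blocking copy also surfaces errors from the kernel launch.
    ok = cudaMemcpy(out.data(), d_out, bytes, cudaMemcpyDeviceToHost) == cudaSuccess;
  }

  cudaFree(d_in);
  cudaFree(d_out);
  if (!ok) out.clear();
  return out;
}
```

Calling `exclusive_sum_one_block` with 128 ones should give element `i` the value `i`. For warp-sized groups `cub::WarpScan` follows the same pattern, and for a scan over a whole device array `thrust::inclusive_scan` or `thrust::exclusive_scan` from `thrust/scan.h` may be the simpler choice.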

MichoChan commented 10 months ago

Thanks so much!