Status: Open. catid opened this issue 5 months ago.
The issue seems to be on this line:
https://github.com/Doraemonzzz/hgru2-pytorch/blob/1e45622e15cf85a257f185d6bd351eadfc1a0444/setup.py#L30
creating build/temp.linux-x86_64-cpython-310/hgru2_pytorch/hgru_real_cuda
gcc -pthread -B /home/catid/mambaforge/envs/train/compiler_compat -Wno-unused-result -Wsign-compare -DNDEBUG -fwrapv -O2 -Wall -fPIC -O2 -isystem /home/catid/mambaforge/envs/train/include -fPIC -O2 -isystem /home/catid/mambaforge/envs/train/include -fPIC -I/home/catid/mambaforge/envs/train/lib/python3.10/site-packages/torch/include -I/home/catid/mambaforge/envs/train/lib/python3.10/site-packages/torch/include/torch/csrc/api/include -I/home/catid/mambaforge/envs/train/lib/python3.10/site-packages/torch/include/TH -I/home/catid/mambaforge/envs/train/lib/python3.10/site-packages/torch/include/THC -I/usr/local/cuda/include -I/home/catid/mambaforge/envs/train/include/python3.10 -c hgru2_pytorch/hgru_real_cuda/hgru_real_cuda.cpp -o build/temp.linux-x86_64-cpython-310/hgru2_pytorch/hgru_real_cuda/hgru_real_cuda.o -O2 -std=c++14 -D_GLIBCXX_USE_CXX11_ABI=0 -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -DTORCH_EXTENSION_NAME=hgru_real_cuda -D_GLIBCXX_USE_CXX11_ABI=0
In file included from /home/catid/mambaforge/envs/train/lib/python3.10/site-packages/torch/include/torch/extension.h:5,
                 from hgru2_pytorch/hgru_real_cuda/hgru_real_cuda.cpp:1:
/home/catid/mambaforge/envs/train/lib/python3.10/site-packages/torch/include/torch/csrc/api/include/torch/all.h:4:2: error: #error C++17 or later compatible compiler is required to use PyTorch.
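The log shows the extension being compiled with `-std=c++14`, while the installed PyTorch headers refuse anything older than C++17. A minimal sketch of the kind of change that would address this, assuming `setup.py` passes a list of compiler flags to the extension; the helper name `require_cxx17` and the sample flag list are hypothetical and are not taken from the repository:

```python
import re

# Matches a GCC/Clang-style C++ standard flag, e.g. -std=c++14 or -std=gnu++14.
_STD_FLAG = re.compile(r"^-std=(c|gnu)\+\+(\w+)$")

# Standard names older than C++17 (including the draft aliases 0x and 1y).
_OLD_STANDARDS = {"98", "03", "0x", "11", "1y", "14"}


def require_cxx17(flags):
    """Return a copy of `flags` that requests at least C++17.

    Hypothetical helper for illustration only: flags asking for an older
    standard are rewritten to C++17, newer ones are left untouched, and
    -std=c++17 is appended if no standard flag is present.
    """
    result = []
    has_std = False
    for flag in flags:
        match = _STD_FLAG.match(flag)
        if match:
            has_std = True
            if match.group(2) in _OLD_STANDARDS:
                # Keep the c/gnu dialect, only bump the version.
                flag = f"-std={match.group(1)}++17"
        result.append(flag)
    if not has_std:
        result.append("-std=c++17")
    return result


# Flags as they appear at the end of the failing gcc command (assumed to
# come from setup.py).
print(require_cxx17(["-O2", "-std=c++14"]))  # ['-O2', '-std=c++17']
```

The returned list would then be handed to the extension's `extra_compile_args` in `setup.py`. Simply editing the flag by hand from `-std=c++14` to `-std=c++17` should have the same effect.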
Maybe we could also get a Triton version of this kernel, to avoid the C++ code entirely?
Proposed fix here: https://github.com/Doraemonzzz/hgru2-pytorch/pull/2
Thank you for your suggestion. I will replace this kernel with Triton code sometime next week.