HazyResearch / ThunderKittens

Tile primitives for speedy kernels
MIT License

c++20 does not work? #45

Open ziyuhuang123 opened 1 month ago

ziyuhuang123 commented 1 month ago

Hi! I am building example/attn/4090 on a 4090 with nvcc 12.3 and gcc/g++ 10, but I get the error below:

(py_hzy_new) 4090-01% make                     
nvcc -ccbin=/home/zyhuang/miniconda3/envs/py_hzy_new/bin/g++ -DNDEBUG -Xcompiler=-fPIE --expt-extended-lambda --expt-relaxed-constexpr -Xcompiler=-Wno-psabi -Xcompiler=-fno-strict-aliasing --use_fast_math -forward-unknown-to-host-compiler -O3 -Xnvlink=--verbose -Xptxas=--verbose -Xptxas=--warn-on-spills -std=c++20 -MD -MT -MF -x cu -lrt -lpthread -ldl -DKITTENS_4090 -arch=sm_89 -lcuda -lcudadevrt -lcudart_static -lcublas 4090_ker.cu -o attn_fwd
../../../src/common/base_types.cuh(93): error: namespace "std" has no member "bit_cast"
      static __attribute__((device)) inline constexpr bf16 zero() { return std::bit_cast<__nv_bfloat16>(uint16_t(0x0000)); }

I guess C++20 is somehow still not supported?

ahepp commented 1 month ago

I was able to reproduce the issue, and resolved it by switching to gcc/g++ 11.
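
For context: `std::bit_cast` is a library feature from the `<bit>` header, and libstdc++ only started shipping it with GCC 11, so `-std=c++20` with g++ 10 as the host compiler accepts the flag but still lacks that function. If upgrading the host compiler is not an option, a rough sketch of a workaround is to test the standard feature macro `__cpp_lib_bit_cast` and fall back to `std::memcpy`. The names `bit_cast_compat` and `HAS_STD_BIT_CAST` below are made up for illustration and are not part of ThunderKittens. Note that the fallback is not `constexpr` and needs a default-constructible target type, so it would not be a drop-in replacement for the `constexpr` use in `base_types.cuh`.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Pull in the feature-test macros when the <version> header is available.
#if defined(__has_include)
#  if __has_include(<version>)
#    include <version>
#  endif
#endif

// __cpp_lib_bit_cast is the standard feature-test macro for std::bit_cast.
// HAS_STD_BIT_CAST is a hypothetical helper macro used only in this sketch.
#if defined(__cpp_lib_bit_cast) && __cpp_lib_bit_cast >= 201806L
#  include <bit>
#  define HAS_STD_BIT_CAST 1
#else
#  define HAS_STD_BIT_CAST 0
#endif

// Hypothetical helper: uses std::bit_cast when the standard library provides
// it, otherwise copies the object representation with std::memcpy.
template <class To, class From>
To bit_cast_compat(const From& src) noexcept {
    static_assert(sizeof(To) == sizeof(From),
                  "source and destination must have the same size");
    static_assert(std::is_trivially_copyable<To>::value &&
                  std::is_trivially_copyable<From>::value,
                  "both types must be trivially copyable");
#if HAS_STD_BIT_CAST
    return std::bit_cast<To>(src);
#else
    To dst;  // fallback path: requires To to be default-constructible
    std::memcpy(&dst, &src, sizeof(To));
    return dst;
#endif
}
```

For example, `bit_cast_compat<std::uint32_t>(1.0f)` reinterprets the bits of a `float` as a 32-bit integer on either code path. Compiling a one-line file that includes `<bit>` and calls `std::bit_cast` with the same `-ccbin` compiler is also a quick way to check whether a given toolchain has the feature.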