windprak opened this issue 3 weeks ago
It looks like PyTorch's C++ extensions are configured with CUDNN_HOME or CUDNN_PATH:
https://github.com/pytorch/pytorch/blob/5a80d2df844f9794b3b7ad91eddc7ba762760ad0/torch/utils/cpp_extension.py#L209
PyTorch's own build is configured with CUDNN_ROOT:
https://github.com/pytorch/pytorch/blob/5a80d2df844f9794b3b7ad91eddc7ba762760ad0/cmake/Modules_CUDA_fix/FindCUDNN.cmake#L4
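Until the naming is unified, one workaround (a sketch; /path/to/cudnn is a placeholder, not a value from my setup) is to point every variant at the same cuDNN install so both lookups resolve:
export CUDNN_PATH=/path/to/cudnn      # read by torch/utils/cpp_extension.py (CUDNN_HOME or CUDNN_PATH)
export CUDNN_HOME="$CUDNN_PATH"
export CUDNN_ROOT="$CUDNN_PATH"       # read by cmake/Modules_CUDA_fix/FindCUDNN.cmake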
So what can I do to handle this issue? Please give a clear and simple answer, thanks!
export CUDNN_PATH=/path/to/cudnn
pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable
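Before building, it can also help to confirm that the exported path really contains the cuDNN headers and libraries. The layout below (include/ and lib/ or lib64/ under $CUDNN_PATH) is an assumption about a typical cuDNN tarball install, not something from this thread:
# sanity check (assumed layout): both commands should list files, not error out
ls "$CUDNN_PATH"/include/cudnn*.h
ls "$CUDNN_PATH"/lib*/libcudnn*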
I'm trying to compile TE on a Slurm cluster because containers aren't fully supported there (MPI issues). My setup is like this:
All the variables echo correctly. I can build Megatron-LM and Apex in this environment without problems, but not TE.
Error: