Open scutcsq opened 10 months ago
Could you describe how you installed fast_rnnt?
I installed fast_rnnt with pip. I have since installed k2, and using the equivalent function from k2 solved the problem.
Hi, we hit the same error after successfully building fast_rnnt for AMD using ROCm 5.4, with correctly installed pytorch 2.0.1 and torchaudio 0.15.2:
File "/home/ubnt/anaconda3/lib/python3.8/site-packages/fast_rnnt/rnnt_loss.py", line 533, in rnnt_loss
scores_and_grads = mutual_information_recursion(
File "/home/ubnt/anaconda3/lib/python3.8/site-packages/fast_rnnt/mutual_information.py", line 294, in mutual_information_recursion
scores = MutualInformationRecursionFunction.apply(
File "/home/ubnt/anaconda3/lib/python3.8/site-packages/torch/autograd/function.py", line 506, in apply
return super().apply(*args, **kwargs) # type: ignore[misc]
File "/home/ubnt/anaconda3/lib/python3.8/site-packages/fast_rnnt/mutual_information.py", line 157, in forward
ans = _fast_rnnt.mutual_information_forward(px, py, boundary, p)
RuntimeError: Failed to find native CUDA module, make sure that you compiled the code with K2_WITH_CUDA.
We want to use fast_rnnt on its own, without k2. We installed it by building from source:
git clone https://github.com/danpovey/fast_rnnt.git
cd fast_rnnt
export FT_MAKE_ARGS="-j32"
pip install --verbose fast_rnnt
It seems that ROCm isn't supported by the build; the build log reports:
-- No NVCC detected. Disable CUDA support
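The log line above suggests the build only looks for NVCC. A minimal sketch for checking which GPU toolchain binaries are visible on `PATH` before building; the helper name `find_gpu_compilers` is hypothetical, and this does not replicate the detection logic of the fast_rnnt build system, which may also search other locations:

```python
import shutil


def find_gpu_compilers(names=("nvcc", "hipcc", "hipify-clang")):
    """Map each tool name to its resolved path on PATH, or None if absent."""
    # shutil.which only searches PATH; tools under e.g. /opt/rocm-*/bin
    # are reported as missing unless that directory has been added to PATH.
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for tool, path in find_gpu_compilers().items():
        print(f"{tool}: {path or 'not found on PATH'}")
```

On a ROCm-only machine one would expect `nvcc` to be reported as missing, which is consistent with the message in the build log.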
@bene-ges Basically if pytorch can run on Rocm, fast_rnnt can also run on it. Will have a look at this issue. Thanks!
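To check the premise that PyTorch itself runs on ROCm on a given machine, something like the following could be used. It is only a sketch: the function name `torch_gpu_backend` is made up for illustration, and it assumes the usual convention that ROCm builds of PyTorch set `torch.version.hip` while CUDA builds set `torch.version.cuda`.

```python
def torch_gpu_backend():
    """Report which GPU backend the installed PyTorch was built for, if any."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {"torch": None, "cuda": None, "hip": None, "gpu_available": False}
    return {
        "torch": torch.__version__,
        # getattr guards against builds that lack one of these attributes.
        "cuda": getattr(torch.version, "cuda", None),
        "hip": getattr(torch.version, "hip", None),
        # ROCm builds expose their devices through the torch.cuda API as well.
        "gpu_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(torch_gpu_backend())
```

A non-empty `hip` entry together with `gpu_available` being true would indicate a working ROCm build of PyTorch; it says nothing about whether fast_rnnt's native module was compiled with GPU support.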
But the core of fast_rnnt is the CUDA code, no? And I believe ROCm does not use CUDA? So it would require a rewrite to support that?
@danpovey, ROCm can convert CUDA code and compile it into an AMD binary. Most projects just add the ROCm compile commands, as PyTorch does, so the PyTorch build system (see its docs) can serve as an example of the right solution.
Example of converting CUDA code to ROCm code and compiling it (matrix-cuda is just an example of CUDA code), on Ubuntu:
git clone https://github.com/lzhengchun/matrix-cuda
cd matrix-cuda
/opt/rocm-5.3.0/bin/hipify-clang matrix_cuda.cu
After this, a file matrix_cuda.cu.hip will appear, which is the source code for ROCm.
Then it can be compiled with hipcc:
/opt/rocm-5.3.0/bin/hipcc matrix_cuda.cu.hip
After this, a file a.out will appear.
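The two steps above can be sketched as a small helper that builds the command lines for any `.cu` file. The function name `hipify_commands` is hypothetical, the ROCm path is the one quoted in this thread and will differ between installations, and the `.hip` output name is taken from the behaviour reported above rather than from the tool's documentation:

```python
import shlex
from pathlib import PurePosixPath


def hipify_commands(cuda_source, rocm_root="/opt/rocm-5.3.0"):
    """Return the hipify-clang and hipcc command lines as argument lists."""
    bin_dir = PurePosixPath(rocm_root) / "bin"
    # Output name as reported in the thread: the input name plus ".hip".
    hip_source = f"{cuda_source}.hip"
    return [
        [str(bin_dir / "hipify-clang"), str(cuda_source)],  # CUDA -> HIP source
        [str(bin_dir / "hipcc"), hip_source],  # HIP source -> a.out
    ]


if __name__ == "__main__":
    for cmd in hipify_commands("matrix_cuda.cu"):
        print(shlex.join(cmd))
```

Passing each list to `subprocess.run(cmd, check=True)` would execute the steps and stop at the first failure.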
Another useful link on porting CUDA to HIP (the notation is almost identical): https://www.lumi-supercomputer.eu/preparing-codes-for-lumi-converting-cuda-applications-to-hip/
I can help with testing on AMD if needed.
OK that's interesting. If it's possible for you to add support for ROCM into our build system (which is I think not entirely trivial), then I think we'd appreciate that very much. This kind of thing will no doubt be used more frequently in the future. (Also: apologies for the very late response.)