Closed anhle-uet closed 4 years ago
One of the reasons why it does not work with new PyTorch is the new CUDA 10.1. Here is the quote from the release notes:
> The non-sync definitions of warp shuffle functions (`__shfl`, `__shfl_up`, `__shfl_down`, and `__shfl_xor`) and warp vote functions (`__any`, `__all`, `__ballot`) have been removed when compilation is targeting devices with compute capability 7.x and higher.
There is one usage of `__shfl_down` in `warp-ctc/src/reduce.cu`, which you can replace with `__shfl_down_sync`.
Unfortunately, Baidu's CTC implementation also depends on ModernGPU, which has not been updated for a long time and uses the same `__shfl` functions. Thus, you also have to patch `warp-ctc/include/contrib/moderngpu/include/device/intrinsics.cuh`.
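For reference, the patch in both files amounts to switching the removed intrinsic to its `_sync` counterpart, which takes an extra mask of participating threads. A sketch (the variable names here are illustrative, not the exact ones in `reduce.cu`):

```cuda
// Before (removed in CUDA 10.1 for compute capability 7.x+):
//   val += __shfl_down(val, offset);

// After: the _sync variants require a mask of participating lanes.
// 0xFFFFFFFF means all 32 lanes of the warp take part.
#if CUDART_VERSION >= 9000
    val += __shfl_down_sync(0xFFFFFFFF, val, offset);
#else
    val += __shfl_down(val, offset);
#endif
```

The `#if` guard keeps the code building against older CUDA toolkits, where only the non-sync variant exists.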
I recommend using the official PyTorch implementation of `ctc_loss`, even though I don't like the spaghetti code in LossCTC.cu.
After the bug fix #27460, PyTorch's implementation seems to be stable.
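For anyone switching over, the built-in loss is exposed as `torch.nn.CTCLoss`. A minimal usage sketch with made-up sizes (`T` time steps, `N` batch, `C` classes, class 0 reserved as the blank):

```python
import torch
import torch.nn as nn

T, N, C, S = 50, 4, 20, 10  # time steps, batch, classes, target length

# Network output as log-probabilities, shape (T, N, C)
log_probs = torch.randn(T, N, C).log_softmax(dim=2)

# Targets must not contain the blank index (0 here)
targets = torch.randint(1, C, (N, S), dtype=torch.long)
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), S, dtype=torch.long)

ctc = nn.CTCLoss(blank=0)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
```

Note that, unlike warp-ctc, `nn.CTCLoss` expects log-probabilities (e.g. after `log_softmax`) rather than raw logits.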
@1ytic Well, I've tried many times, but the official implementation doesn't work for me.
@jpuigcerver Do you intend to continue to maintain this repo? :)
Dear users, since PyTorch now includes its own CTC implementation, I'd suggest using it instead of Baidu's.
Unfortunately, I don't have the time to fix Baidu's implementation myself.
Your work is great, since the official PyTorch implementation doesn't work for me. Could you release a newer version that is compatible with PyTorch 1.2.0 and 1.3.0? Thanks a lot!