Alexander-H-Liu / End-to-end-ASR-Pytorch

This is an open-source project (formerly named Listen, Attend and Spell - PyTorch Implementation) for end-to-end ASR implemented with PyTorch, the well-known deep learning toolkit.
MIT License
1.19k stars, 318 forks

Questions regarding ctc prefix score algorithm #16

Closed Chung-I closed 5 years ago

Chung-I commented 5 years ago

https://github.com/Alexander-H-Liu/End-to-end-ASR-Pytorch/blob/77b657b7004cabfd56076a818cecc0ce855f6b0a/src/ctc.py#L50-L60

Hi there! Thanks for implementing such a great SoTA end-to-end ASR toolkit! I really appreciate the complicated joint decoding algorithm part. I'm a little confused about the implementation of CTC prefix score decoding. In ctc.py, line 51, I'm not sure whether the last_char dim of prev_blank, i.e. prev_blank[last_char], should be assigned logzero. Would it make more sense for prev_nonblank[last_char] to be assigned logzero instead? [Screenshot: 2019-02-27 13-25-42] Alex Graves's PhD thesis states, in line 17 of the algorithm, that the nonblank part of newLabelProb is (log)zero if p* ends in k, which might correspond to the last_char dim of prev_nonblank?
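To make the question concrete, here is a minimal sketch of the rule as I read it from the thesis. It is plain Python, not the repo's actual code: the names `new_label_log_prob`, `log_add`, and `LOG_ZERO` are made up for illustration, and the function covers only a single time step with scalar prefix probabilities.

```python
import math

LOG_ZERO = -1e10  # hypothetical stand-in for log(0)


def log_add(a, b):
    """Return log(exp(a) + exp(b)) without leaving the log domain."""
    hi, lo = max(a, b), min(a, b)
    return hi + math.log1p(math.exp(lo - hi))


def new_label_log_prob(log_nonblank, log_blank, vocab_size, last_char=None):
    """Log-probability mass at t-1 that may be extended by each label k.

    log_nonblank: log-prob of the prefix p* at t-1, paths ending in a nonblank
    log_blank:    log-prob of the prefix p* at t-1, paths ending in a blank
    last_char:    index of the last label of p* (None for an empty prefix)
    """
    scores = []
    for k in range(vocab_size):
        # If p* already ends in k, a new k needs a blank in between,
        # so only the nonblank part is masked out for that label.
        nonblank_k = LOG_ZERO if k == last_char else log_nonblank
        scores.append(log_add(nonblank_k, log_blank))
    return scores


# Toy numbers (assumed): nonblank mass 0.3, blank mass 0.2, prefix ends in label 2.
phi = new_label_log_prob(math.log(0.3), math.log(0.2), vocab_size=4, last_char=2)
print([round(math.exp(s), 3) for s in phi])
```

With these toy numbers, every label keeps the full mass of 0.5 except label 2, which is left with only the blank mass of 0.2. Masking prev_blank[last_char] instead would leave label 2 with the nonblank mass of 0.3, which is the opposite of what the thesis describes.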

Thanks again for all the hard work!

Alexander-H-Liu commented 5 years ago

@Chung-I You're right, this is a mistake I made. Fixing it also addresses the bug of the weird <eos> probability given at the end of decoding. The bug was fixed in #17. Thanks a lot!