Closed iceychris closed 3 years ago
Thank you! Looks simple. Unfortunately I will not be able to test the final streaming performance, but I will happily add this code. Could you fix core_gather.cu as well? This is the most efficient implementation and should be used by default.
Nice, thank you!
Hey!
Thank you for this great library!
This PR implements FastEmit regularization from https://arxiv.org/abs/2010.11148. The gradients of all non-blank symbols are scaled up by a small factor (around 1.004). Intuitively, the model is encouraged to output symbols faster, thus reducing latency.
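To make the scaling concrete, here is a rough pure-Python sketch of the idea, not the actual kernel code from this PR. The function name `scale_non_blank_grads`, the `[T][U][V]` nested-list layout, and the `fastemit_lambda` argument are assumptions for illustration only; the factor is taken to be `1 + lambda`, so `lambda = 0.004` corresponds to the 1.004 mentioned above.

```python
def scale_non_blank_grads(grads, blank=0, fastemit_lambda=0.004):
    """Return a copy of grads with every non-blank entry scaled by (1 + lambda).

    grads is assumed to be a nested list indexed as [t][u][v], where v runs
    over the vocabulary and `blank` is the index of the blank symbol.
    This is a hypothetical helper, not part of the library.
    """
    factor = 1.0 + fastemit_lambda
    return [
        [
            # Leave the blank gradient untouched, scale all label gradients.
            [g if v == blank else g * factor for v, g in enumerate(cell)]
            for cell in row
        ]
        for row in grads
    ]


# One (t, u) cell with a vocabulary of size 3, blank at index 0.
example = [[[1.0, 2.0, -3.0]]]
scaled = scale_non_blank_grads(example, blank=0, fastemit_lambda=0.004)
```

In this sketch the blank gradient stays at `1.0`, while the two label gradients grow slightly in magnitude, which is what pushes the model toward emitting symbols earlier.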