This is a simple PyTorch implementation of the paper "State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention with Dilated 1D Convolutions". See main.ipynb for training and inference details.
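For orientation, here is a minimal sketch of the multi-stream idea named in the paper title. It is not the code in main.ipynb. It assumes each stream applies one multi-head self-attention layer followed by a small stack of 1D convolutions with a stream-specific dilation rate, and that the stream outputs are concatenated and projected. The class names `Stream` and `MultiStream`, the layer sizes, and the dilation rates are hypothetical choices made for illustration.

```python
import torch
import torch.nn as nn


class Stream(nn.Module):
    """One stream: self-attention, then dilated 1D convolutions (illustrative)."""

    def __init__(self, dim, num_heads, dilation, num_convs=2, kernel_size=3):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        # For an odd kernel size this padding keeps the time length unchanged.
        padding = dilation * (kernel_size - 1) // 2
        layers = []
        for _ in range(num_convs):
            layers.append(
                nn.Conv1d(dim, dim, kernel_size, dilation=dilation, padding=padding)
            )
            layers.append(nn.ReLU())
        self.convs = nn.Sequential(*layers)

    def forward(self, x):
        # x: (batch, time, dim)
        attn_out, _ = self.attn(x, x, x)
        x = self.norm(x + attn_out)          # residual connection + layer norm
        x = self.convs(x.transpose(1, 2))    # Conv1d expects (batch, dim, time)
        return x.transpose(1, 2)             # back to (batch, time, dim)


class MultiStream(nn.Module):
    """Parallel streams with different dilation rates, concatenated and projected."""

    def __init__(self, dim=64, num_heads=4, dilations=(1, 2, 3), out_dim=64):
        super().__init__()
        self.streams = nn.ModuleList(
            [Stream(dim, num_heads, d) for d in dilations]
        )
        self.proj = nn.Linear(dim * len(dilations), out_dim)

    def forward(self, x):
        # Every stream sees the same input; outputs are joined on the feature axis.
        merged = torch.cat([stream(x) for stream in self.streams], dim=-1)
        return self.proj(merged)


# Usage: a batch of 2 utterances, 50 frames each, 64-dimensional features.
model = MultiStream()
features = torch.randn(2, 50, 64)
out = model(features)  # out.shape == (2, 50, 64)
```

Giving each stream its own dilation rate lets the streams cover different temporal context widths over the same input frames, which is the motivation for running them in parallel before merging.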