Closed: christophmluscher closed this issue 5 years ago.
I also think they've run several interesting experiments: LSTM instead of Transformer rescoring, and their attention+BPE+{LSTM or Transformer} setup is also an interesting approach. @syhw What do you think, is there a point in including multiple well-implemented experiments in such cases? LibriSpeech is very specific, and its results aren't directly applicable to other, more complicated domains, so just improving SOTA by less than 2% looks pretty random to me (especially if it's really an LM improvement, not an AM one!). So I value the approaches more than the exact results.
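To be concrete about what I mean by rescoring: the second pass just re-ranks the first-pass n-best list with an external LM. A minimal sketch of that idea, assuming a generic `lm_score` function and an interpolation weight `lam`; both are placeholders for illustration, not the paper's actual setup:

```python
from typing import Callable, List, Tuple


def rescore_nbest(
    nbest: List[Tuple[str, float]],    # (hypothesis, first-pass log score)
    lm_score: Callable[[str], float],  # log-probability under the external LM
    lam: float = 0.5,                  # interpolation weight, tuned on dev data
) -> str:
    """Return the hypothesis with the best interpolated score."""
    return max(
        nbest,
        key=lambda hyp: (1 - lam) * hyp[1] + lam * lm_score(hyp[0]),
    )[0]


# Toy usage with a stub LM that simply prefers shorter hypotheses.
if __name__ == "__main__":
    nbest = [("the cat sat", -4.2), ("the cats at", -4.0)]
    print(rescore_nbest(nbest, lm_score=lambda s: -len(s.split()), lam=0.3))
```

Whether the second-pass LM is an LSTM or a Transformer only changes what computes `lm_score`; the re-ranking step itself is the same, which is why I find the comparison between the two more informative than the final WER delta.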
It's OK to have a few lines of results for a given paper if they're interesting/important. (But please justify the specifics in the notes.)
I added the mentioned model and expanded on the notes.
Maybe expand slightly on the notes / the specifics of the system?