There is a feature available on Web-based Twitch via an extension that provides closed captions dynamically generated from the audio.
The voice recognition is contextual: as more words are spoken, the interpretation of earlier speech that is still on screen can be revised.
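To make the revising behaviour concrete, here is a minimal sketch of a caption buffer that keeps committed words separate from a tail that can still change. It does not describe how the Twitch extension is implemented; the `CaptionBuffer` class, its `update` method, and the `is_final` flag are all hypothetical names chosen for illustration.

```python
class CaptionBuffer:
    """Holds committed caption words plus a revisable tail (the interim guess)."""

    def __init__(self):
        self.final_words = []    # words the recognizer has committed to
        self.interim_words = []  # latest guess for speech still being decoded

    def update(self, hypothesis, is_final=False):
        """Replace the interim tail with a new hypothesis, or commit it."""
        words = hypothesis.split()
        if is_final:
            # The recognizer will not revise these words any further.
            self.final_words.extend(words)
            self.interim_words = []
        else:
            # A newer hypothesis overwrites the previous guess entirely,
            # which is why on-screen text can change after it appears.
            self.interim_words = words

    def text(self):
        """Return the caption text as it would currently be displayed."""
        return " ".join(self.final_words + self.interim_words)


buf = CaptionBuffer()
buf.update("I scream")                          # first guess
first_guess = buf.text()
buf.update("ice cream is")                      # more context revises it
revised = buf.text()
buf.update("ice cream is great", is_final=True)
committed = buf.text()
```

In this sketch the viewer would first see "I scream", then see it replaced by "ice cream is" once more audio arrives, which matches the on-screen revisions described above.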
It currently presents as a flow of roughly three lines of undifferentiated words, without sentence punctuation or any kind of spacing. It seems likely to me that they will iterate on this feature to add this, or some other kind of visual indication of words that are separated by longer pauses.
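One way such a pause-based indication could work is sketched below, assuming the recognizer exposes per-word start and end times. That is an assumption; nothing above confirms the extension has this data. The function name `segment_by_pause` and the 0.7 second threshold are made up for illustration.

```python
def segment_by_pause(timed_words, pause_threshold=0.7):
    """Group (word, start_sec, end_sec) tuples into phrases split at long pauses."""
    phrases = []
    current = []
    prev_end = None
    for word, start, end in timed_words:
        # Start a new phrase when the silence since the previous word is long.
        if current and prev_end is not None and start - prev_end >= pause_threshold:
            phrases.append(" ".join(current))
            current = []
        current.append(word)
        prev_end = end
    if current:
        phrases.append(" ".join(current))
    return phrases


# Hypothetical timings: a one-second gap separates "everyone" and "welcome".
words = [
    ("hello", 0.0, 0.4),
    ("everyone", 0.5, 1.0),
    ("welcome", 2.0, 2.5),
    ("back", 2.6, 2.9),
]
phrases = segment_by_pause(words)
```

Each resulting phrase could then be shown on its own line or followed by a visible gap, so that longer pauses become visible in the caption flow.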