divyenduz / languagelearners

LingoParrot - LanguageLearners.club

Listen and transcribe #4

Open divyenduz opened 5 years ago

divyenduz commented 5 years ago

Like listening, speaking is also very important. This is where LingoParrot can listen to the user and use Amazon Transcribe to convert the speech back to text.

This will encourage users to say things out loud and help them with the speaking side of language learning.

Related post: https://developer.ibm.com/answers/questions/424777/help-how-do-i-use-speech-to-text-with-my-telegram/

divyenduz commented 5 years ago

Streaming transcription is not supported by the JS SDK yet: https://github.com/aws/aws-sdk-js/issues/2416

So we need to use the "start a job, then poll it" method instead.
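
A minimal sketch of the start-and-poll approach, not the code used in the bot. The `TranscribeLike` interface, `pollDelays`, and `transcribe` are hypothetical names introduced here for illustration; with the AWS SDK, the two interface methods would presumably wrap `startTranscriptionJob` and `getTranscriptionJob`. The backoff values are assumptions, not measured settings.

```typescript
// Hypothetical minimal client interface, so the polling logic can be
// exercised without the AWS SDK or real credentials.
type JobStatus = "IN_PROGRESS" | "COMPLETED" | "FAILED";

interface JobInfo {
  status: JobStatus;
  transcriptUri?: string;
}

interface TranscribeLike {
  startJob(jobName: string, mediaUri: string): Promise<void>;
  getJob(jobName: string): Promise<JobInfo>;
}

// Build the list of waits between polls: exponential backoff capped at
// maxDelayMs, stopping before the total wait would exceed budgetMs.
function pollDelays(initialMs: number, maxDelayMs: number, budgetMs: number): number[] {
  if (initialMs <= 0) throw new Error("initialMs must be positive");
  const delays: number[] = [];
  let delay = initialMs;
  let total = 0;
  while (total + delay <= budgetMs) {
    delays.push(delay);
    total += delay;
    delay = Math.min(delay * 2, maxDelayMs);
  }
  return delays;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Start the job, then poll until it completes, fails, or the schedule runs out.
// Resolves with the transcript location reported by the client.
async function transcribe(
  client: TranscribeLike,
  jobName: string,
  mediaUri: string,
  delays: number[] = pollDelays(1000, 8000, 90000), // assumed defaults
): Promise<string> {
  await client.startJob(jobName, mediaUri);
  for (const delay of delays) {
    await sleep(delay);
    const job = await client.getJob(jobName);
    if (job.status === "COMPLETED" && job.transcriptUri) return job.transcriptUri;
    if (job.status === "FAILED") throw new Error(`Transcription job ${jobName} failed`);
  }
  throw new Error(`Transcription job ${jobName} did not finish in time`);
}
```

Passing the delay schedule in as a parameter keeps the loop easy to test with a fake client and zero-length waits.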

divyenduz commented 5 years ago

This feature has landed in @LingoParrotDevBot with a hardcoded MP3, as in-memory OGG-to-MP3 conversion is still pending.

It is pretty much unusable for the following reasons:

  1. There is no UX at the moment; the bot just prints back the transcription. This should be a game-esque flow.
  2. AWS Transcribe does not support German (even though its TS types show "de-DE").
  3. The AWS JS/TS SDK for Transcribe does not support real-time transcription yet: https://github.com/aws/aws-sdk-js/issues/2416. Hence, I used the "submit job" + "poll it" method, and it takes 30-90 sec even for short voice recordings (1-2 sec). This is without the OGG-to-MP3 conversion. Also, AWS Transcribe only supports source files residing on S3, so I had to move the voice recording from the Telegram server to S3, which adds to the slowness. The slowness might be mitigated by making the heavier operations async using the chosen_inline_result update type.
  4. This cannot be deployed to Lambda because the function sits behind API Gateway, which imposes a timeout of about 30 sec. Somehow, though, it is working for the Lambda deployed for @LingoParrotDevBot with ~60 sec execution time.
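
One way to read "making the heavier operations async" is to acknowledge the incoming update right away and hand the slow work (copy to S3, transcription, polling) to a separate worker. The sketch below only illustrates that split; `VoiceUpdate`, `TranscriptionTask`, `toTask`, and `makeWebhookHandler` are hypothetical names, and the `enqueue` function stands in for whatever durable queue or async invocation is actually used.

```typescript
// Hypothetical shapes: only the fields this sketch needs.
interface VoiceUpdate {
  chatId: number;
  fileId: string;
}

interface TranscriptionTask {
  chatId: number;
  fileId: string;
  enqueuedAt: number;
}

interface WebhookResponse {
  statusCode: number;
  body: string;
}

// Assumed to hand the task to durable storage (a queue, another function
// invoked asynchronously, ...) and resolve quickly.
type Enqueue = (task: TranscriptionTask) => Promise<void>;

// Pure helper: describe the deferred work for one voice update.
function toTask(update: VoiceUpdate, nowMs: number): TranscriptionTask {
  return { chatId: update.chatId, fileId: update.fileId, enqueuedAt: nowMs };
}

// The webhook handler does only the cheap part: enqueue and acknowledge.
// The slow transcription happens elsewhere, outside the request timeout.
function makeWebhookHandler(enqueue: Enqueue, now: () => number = Date.now) {
  return async (update: VoiceUpdate): Promise<WebhookResponse> => {
    await enqueue(toTask(update, now()));
    return { statusCode: 200, body: "queued" };
  };
}
```

The handler awaits `enqueue` before responding because work left running after a Lambda handler returns is not guaranteed to finish.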

In its current state, this is nowhere near user-ready. The feature will stay hidden until multiple improvements are made.

To try it out in its current state, just send a voice message to the bot (English is supported).

divyenduz commented 5 years ago

I am not sure about the use case of Amazon Lex, but maybe it can be used for this.

divyenduz commented 5 years ago

This feature is behind a feature flag and disabled for now because of its shortcomings.