PeoplePlusAI / sunva-ai

SUNVA AI: Seamless conversation loop for the deaf

LLM to filter irregular speech #18

Open gksoriginals opened 3 months ago

gksoriginals commented 3 months ago

The deaf person will also try to speak sometimes. We need a model to figure out whether the transcription quality is poor, and if it falls below a particular threshold we should not show it.
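A minimal sketch of the thresholding idea described above, assuming the ASR backend exposes a per-utterance confidence score (the field name and threshold value are illustrative assumptions, not part of the SUNVA codebase):

```python
# Sketch: hide transcripts whose ASR confidence falls below a threshold.
# Real ASR APIs expose per-word or per-segment scores in different shapes;
# here we assume a single float per utterance for simplicity.

CONFIDENCE_THRESHOLD = 0.6  # would need tuning on real irregular-speech data


def should_display(transcript: str, confidence: float,
                   threshold: float = CONFIDENCE_THRESHOLD) -> bool:
    """Return True if the transcript is good enough to show to the other party."""
    return bool(transcript.strip()) and confidence >= threshold


# Example: a confident transcript is shown, a low-confidence one is suppressed.
should_display("hello how are you", 0.92)  # shown
should_display("h aro yu", 0.31)           # hidden
```

The same gate could later be replaced by a learned quality classifier once enough labelled samples are collected.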

gksoriginals commented 3 months ago

@bsbarkur, thoughts on this?

maximaminima commented 3 months ago

Sure, the proper way to do this is perhaps to:

  1. Offer the choice of speaking, OR
  2. Typing, leveraging TTS to participate in the loop.

And try out both of these modes! The person should have the freedom to select either one.

gksoriginals commented 3 months ago

Yes. In the first case we can collect some irregular speech data in a privacy-preserving way to train models for them at a later stage.

maximaminima commented 3 months ago

https://sites.research.google/euphonia/about/ https://www.isca-archive.org/interspeech_2021/macdonald21_interspeech.pdf

References to read for irregular speech data training @gksoriginals

gksoriginals commented 3 months ago

@maximaminima I was thinking about a postprocessing layer using an LLM to fix the sentences.
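One way to sketch such a post-processing layer: wrap an LLM call that rewrites a noisy ASR transcript into a clean sentence. Here `llm` is any callable taking a prompt and returning text (e.g. a thin wrapper around a chat-completions API); the prompt wording and the `UNCLEAR` sentinel are illustrative assumptions, not an agreed design:

```python
# Sketch of an LLM post-processing layer for noisy transcripts.
from typing import Callable, Optional

PROMPT_TEMPLATE = (
    "The following is a possibly garbled speech transcription. "
    "Rewrite it as a clear, grammatical sentence, preserving the meaning. "
    "If it is unintelligible, reply with exactly UNCLEAR.\n\n"
    "Transcription: {text}"
)


def postprocess_transcript(text: str, llm: Callable[[str], str]) -> Optional[str]:
    """Return the cleaned sentence, or None if the LLM cannot recover it."""
    reply = llm(PROMPT_TEMPLATE.format(text=text)).strip()
    return None if reply == "UNCLEAR" else reply
```

Returning `None` for unintelligible input lets this layer plug into the same display gate discussed earlier: if the LLM cannot recover a sentence, nothing is shown.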

computationalmama commented 3 months ago

> @maximaminima I was thinking about a postprocessing layer using an LLM to fix the sentences.

This seems like a good, quick solution. We just need to ensure there is an option that makes the LLM identify the speaker's language. Would that work?

gksoriginals commented 2 months ago

@ambikajo but we would need to fine-tune ASR models for them to understand the irregular speech of a deaf person, which I guess is a bit complex. Speaker language identification, on the other hand, most ASR models/APIs already do.

computationalmama commented 2 months ago

Ok, I understand what you mean now. I can do an ASR eval (on Gooey) if you have some samples with their transcriptions.
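For the eval itself, the standard metric is word error rate (WER) between the reference transcriptions and the ASR hypotheses. A self-contained sketch via word-level edit distance (libraries like `jiwer` do this too; this version is just for illustration):

```python
# Sketch: word error rate (WER) = word-level edit distance / reference length.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)


# Example: one word ("are") dropped out of four -> WER 0.25.
wer("hello how are you", "hello how you")
```

Averaging this over the collected samples would give a baseline to compare ASR models or post-processing variants against.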