In our current User Story, the educator selects the child who is reading. However, after the child has read, the educator will ask questions and the child will answer (possibly with questions from the child's side). How do we handle this conversation (regardless of our currently defined Use Cases)?
On the one hand, our POC should demonstrate that we can evaluate the provided (text) data according to the use case.
On the other hand, our POC should demonstrate that our entire system makes the teaching process significantly easier.
So this issue should be solved as part of our POC, as it will show the simplicity of the entire process.
Two possibilities to solve this issue:

1. The robust solution (handle multiple speakers talking freely and automate everything), mentioned in issue #10.
2. Use Whisper (speech-to-text) to transcribe the entire conversation (which we assume takes place between the teacher and the selected student), then prompt another of our selected LLMs to split this conversation into speaker turns (based on the back and forth) and to filter out text that doesn't make sense (e.g. transcription errors). This workaround was mentioned here.
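A minimal sketch of the text-processing side of the second option. The prompt wording, the `TEACHER:`/`STUDENT:` labels, and the function names are placeholders of mine, not anything our stack prescribes; the actual LLM call is left out since it depends on which of our selected LLMs we use. The idea is just: build one prompt from the raw transcript, then parse the model's labelled output back into speaker turns.

```python
import re

def build_split_prompt(transcript: str) -> str:
    """Ask the LLM to rewrite a raw, unlabelled transcript as speaker turns.

    The raw transcript would come from Whisper (e.g. the openai-whisper
    package: model.transcribe(audio_path)["text"]), which returns a single
    block of text with no speaker labels.
    """
    return (
        "The following is a transcript of a conversation between a teacher "
        "and one student. Rewrite it as one line per speaker turn, prefixed "
        "with 'TEACHER:' or 'STUDENT:'. Drop fragments that are clearly "
        "transcription errors or do not make sense.\n\n"
        f"Transcript:\n{transcript}"
    )

def parse_turns(llm_output: str) -> list[tuple[str, str]]:
    """Parse the LLM's answer into (speaker, utterance) pairs.

    Lines that do not match the expected 'LABEL: text' format are skipped
    rather than guessed at, so leftover noise stays out of the evaluation.
    """
    turns = []
    for line in llm_output.splitlines():
        match = re.match(r"^(TEACHER|STUDENT):\s*(.+)$", line.strip())
        if match:
            turns.append((match.group(1), match.group(2)))
    return turns
```

The filtered `(speaker, utterance)` pairs could then feed directly into the evaluation step of the use case, with the student's turns as the data to assess.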
We can already implement the second solution with our current setup (both locally and online), so I would suggest we do this and then test it with the data that should arrive soon.
Hello @Bruno-val-bus
What do you guys think?