huggingface / speech-to-speech

Speech To Speech: an effort for an open-sourced and modular GPT4-o
Apache License 2.0

Client side log/chatlog #126

Open mattfro opened 3 weeks ago

mattfro commented 3 weeks ago

Is it possible to stream the logs to the client? At least the speech-to-text and text-to-speech output, shown like a dialog?

Is this already supported, or is it something that could be added?

The reason: if you run it on another device, it would be easier to see what the STT actually recognizes (that is, what the LLM really hears). If you run it without a speaker (or muted), you could also see what the LLM says. The text could also be used on the client side for function calling, for example to trigger an action or call an API.

andimarafioti commented 2 weeks ago

Yes, this would be useful! And it doesn't add too much complexity or data to the packages. The main reason why it's not done is that the speech is continuous and the text is more like a once and done thing. But we should work on it. We welcome contributions as well!
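One way to picture mixing continuous audio with occasional text on the same connection is a small tagged-frame format. The sketch below is purely illustrative and is not the project's actual protocol: the frame layout, the `AUDIO`/`TEXT` tags, and the `encode_audio`, `encode_text`, and `decode_frames` helpers are all hypothetical names chosen for this example.

```python
import json
import struct

# Hypothetical frame type tags (assumptions, not part of speech-to-speech).
AUDIO, TEXT = 0, 1
# Header: 1-byte frame type, then 4-byte big-endian payload length.
HEADER = struct.Struct("!BI")


def encode_audio(chunk: bytes) -> bytes:
    """Wrap a raw audio chunk in an AUDIO frame."""
    return HEADER.pack(AUDIO, len(chunk)) + chunk


def encode_text(role: str, text: str) -> bytes:
    """Wrap one transcript line (e.g. STT result or LLM reply) in a TEXT frame."""
    payload = json.dumps({"role": role, "text": text}).encode("utf-8")
    return HEADER.pack(TEXT, len(payload)) + payload


def decode_frames(buffer: bytes):
    """Split a byte buffer into complete frames; return (frames, leftover bytes)."""
    frames = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        kind, length = HEADER.unpack_from(buffer, offset)
        end = offset + HEADER.size + length
        if end > len(buffer):
            break  # incomplete frame: keep the bytes until more data arrives
        payload = buffer[offset + HEADER.size:end]
        if kind == TEXT:
            frames.append(("text", json.loads(payload.decode("utf-8"))))
        else:
            frames.append(("audio", payload))
        offset = end
    return frames, buffer[offset:]
```

With a scheme like this, a client could play the `audio` frames as they arrive and append each `text` frame to a chat log, or inspect the text to decide whether to call some local function.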

mattfro commented 1 week ago

> Yes, this would be useful! And it doesn't add too much complexity or data to the packages. The main reason why it's not done is that the speech is continuous and the text is more like a once and done thing. But we should work on it. We welcome contributions as well!

I wish, I mean I would... if I had any coding skills :) I'm just using code snippets here and there and trying to get stuff working together.