undreamai / LLMUnity

Create characters in Unity with LLMs!
MIT License

Integrate text-to-speech and speech-to-text functionality #44

Open amakropoulos opened 5 months ago

ArEnSc commented 4 months ago

please make this an optional package that is separate

amakropoulos commented 4 months ago

Yes certainly, it will be possible to attach STT or TTS to the chat functionality but it will not be enabled by default.

simoninithomas commented 3 months ago

Hey there 👋 , will you use Sentis for STT and TTS? Or do you have another idea?

We have some Sentis models on the Hub that are super fast (Tiny Whisper and Jets).

Tiny Whisper: https://huggingface.co/unity/sentis-whisper-tiny
Jets: https://huggingface.co/unity/sentis-jets-text-to-speech

Demo with Whisper: https://singularite.itch.io/jammo-the-robot-with-unity-sentis-whisper-version

amakropoulos commented 3 months ago

Hi, thank you for the suggestions! I need to do a small exploration first, but yes, I was thinking of starting with your Whisper-Tiny model 🙂. Ideally I would like to support a range of models, e.g. similar to the whisper.cpp project, but I need to have it working cross-platform in Unity, which is a work in progress (link).

By the way, thanks a lot for your great work on the sharp-transformers ⭐! I'm using it in the other repo, RAGSearchUnity, to build a RAG similarity search system!

siddhant-bharti commented 3 months ago

Hi @amakropoulos, I want this functionality for a project I am building! Are you planning to add it soon? I can also help by raising a PR for it, if you are fine with that. Looking forward to hearing from you. Thanks!

amakropoulos commented 3 months ago

@siddhant-bharti I'm replying here as well :). This is the next big feature that I'll work on soon.

@simoninithomas I can't use Jets because it has a CC-BY-4.0 license. The Unity Asset Store does not allow packages with licenses that require attribution, and I'd like LLM for Unity to be there as well (P.S. we are live on the Asset Store as of last week 🎉!)

amakropoulos commented 2 months ago

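This feature is blocked at the moment. I can't find an open-source TTS library to integrate that fulfills the following requirements:

  • C/C++/C# code without many dependencies
  • MIT/Apache 2.0 or any other equivalent license that is open-source and attribution-free
  • allow multiple voices

The best solution would be Piper, but at the moment it has a potential license issue due to using espeak (link).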

Pipsun commented 2 months ago

This feature is blocked at the moment. I can't find an open-source library for TTS to integrate that fulfills the following requirements:

  • C/C++/C# code without many dependencies
  • MIT/Apache 2.0 or any other equivalent license that is open-source and attribution-free
  • allow multiple voices

The best solution would be Piper but at the moment has a potential license issue due to using espeak (link).

Hello, I've integrated your project with OpenCV for face tracking, VRoid for the avatar, Vosk for STT, and Piper for TTS. I think the most interesting addition would be an integration with RVC, but I have no time for it. Do you know of any ready-to-use RVC integrations for Unity?

Swiftyos commented 2 months ago

Adding TTS and STT functionality would take llamafile to the next level!

SubatomicPlanets commented 1 week ago

I also think that Piper is the best choice for TTS and Whisper for STT. I made a project using the UnityPiper and whisper.unity projects. It works, but getting everything running was a bit complicated. I also found Piper without espeak, but I don't know how well it works.