ardha27 / AI-Waifu-Vtuber

AI Vtuber for Streaming on Youtube/Twitch
758 stars 125 forks

[Feature Request]: Get answers by writing, and totally free (no Openai API) #13

Closed GiusTex closed 1 year ago

GiusTex commented 1 year ago

Can this webui be integrated into your code, so we don't have to use the OpenAI API, which is not free? (I know, I'm stingy.) Anyway, the webui does not require a powerful computer and supports several models. And based on this video, from minute 17:45, we see that the webui can also generate audio files (using the TTS extension).

  1. So, can a third method be added, to type a message and pass it to "speech_text" (if that is the right function)?
  2. Can the generated audio file be used to make the vtuber model's mouth move? I don't know if passing the audio through VB-Audio Virtual Cable is possible, or if that is the right method. Maybe this way we don't need the voice synthesizer/text-to-speech.

Edit:

  1. I think that webui is too complicated to use for writing and getting answers to then send to ardha27's app, but maybe this console is the right way. More details in the comments.
  2. I managed to connect the virtual audio cable to the speakers and make the vtuber model talk when the webui produces responses (totally free, without using the OpenAI API), but I still think it would be nice to be able to switch between 3 modes in ardha27's app: use the mic or the live chat when you stream, or just text if you don't stream and don't want to make too much noise.
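The three-mode idea above could be sketched roughly like this (a minimal sketch; the function names and mode numbers are assumptions for illustration, not ardha27's actual code):

```python
# Hypothetical sketch of the three input modes suggested above.
# Mode numbers and helper names are assumptions, not the app's real code.

def transcribe_microphone() -> str:
    # Placeholder: the real app would record audio and run speech-to-text.
    raise NotImplementedError

def fetch_latest_livechat_message() -> str:
    # Placeholder: the real app would poll the YouTube/Twitch chat.
    raise NotImplementedError

def get_user_message(mode: int) -> str:
    """Return the text to send to the AI, depending on the chosen mode."""
    if mode == 1:
        # Streaming with a microphone: record and transcribe speech.
        return transcribe_microphone()
    elif mode == 2:
        # Streaming: read the latest live-chat message.
        return fetch_latest_livechat_message()
    elif mode == 3:
        # Not streaming: just type the message, no noise needed.
        return input("You: ")
    raise ValueError(f"unknown mode: {mode}")
```

Whatever string this returns would then go through the existing answer/TTS pipeline unchanged.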
ardha27 commented 1 year ago
  1. that webui is using Silero TTS for the voice synthesizer, and I already applied it to my repo just now; you can check it.
  2. Yes, I use the generated audio to make the vtuber model move; you can check the demo/explanation video in the readme
GiusTex commented 1 year ago

Sorry, I didn't explain myself well (long story); I wanted to write my message, not record it, which is why I suggested that webui. If the text option (the third method I mentioned in point one) can be added without the new webui, perfect

ardha27 commented 1 year ago

Sorry, I didn't get it. So you want to use that webui for messaging and my code for doing the TTS?

GiusTex commented 1 year ago
ardha27 commented 1 year ago

oke, I understand. I also want to try using another LLM for my project. But so far I have never used any LLM besides OpenAI, and I'm afraid that if I run an LLM locally it will put a burden on my PC, since I will be running OBS and VTube Studio for streaming, and LLMs use a lot of VRAM: https://github.com/oobabooga/text-generation-webui/wiki/System-requirements

GiusTex commented 1 year ago

Understood. What about using it on Google Colab? It requires slightly different code, but at least that way we have more computation power

ardha27 commented 1 year ago

I'm still looking for LLM code on Colab that provides an API, so I can call that API from my code

GiusTex commented 1 year ago

Sorry, I don't understand the part about providing the API in your code on Colab; can you rephrase? Do you want to keep the API part? The webui doesn't need the API. Or maybe on Colab you add a new mode, mode 3, that calls the webui, which doesn't need an API, while the other modes use the default code with the API. So at the beginning, when we install the libraries, we install the webui too, and if the user chooses mode 3 the console prints the browser link to use it with the new code part.

Do you want an example? If you want, I can create a demo Colab with this idea (your default code + mode 3 with the webui)
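The mode-3 dispatch described above could look something like this (a sketch only; the endpoint URL and payload schema are assumptions based on text-generation-webui's API extension and may differ between versions, so check the docs of the version you install):

```python
# Sketch of a "mode 3" that queries a locally or Colab-hosted
# text-generation-webui instead of the OpenAI API. The endpoint URL and
# payload/response shapes below are assumptions based on the webui's API
# extension; verify them against your webui version before relying on them.
WEBUI_URL = "http://127.0.0.1:5000/api/v1/generate"

def build_payload(prompt: str, max_new_tokens: int = 200) -> dict:
    """JSON body for the webui's generate endpoint (assumed schema)."""
    return {"prompt": prompt, "max_new_tokens": max_new_tokens}

def ask(prompt: str, mode: int, post) -> str:
    """Dispatch by mode: mode 3 goes to the webui, other modes keep
    the existing OpenAI path.

    `post` is an HTTP POST callable (e.g. requests.post), injected so the
    dispatch logic can be exercised without a live server.
    """
    if mode == 3:
        resp = post(WEBUI_URL, json=build_payload(prompt))
        return resp.json()["results"][0]["text"]
    raise NotImplementedError("modes 1 and 2 keep the existing OpenAI code")
```

In real use you would call `ask(message, 3, requests.post)`; the `results[0]["text"]` parsing assumes the older API response shape.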

ardha27 commented 1 year ago

yes please, that would be nice

GiusTex commented 1 year ago

Oke.

Edit:

  1. I added a demo of a third method, which loads this chat webui online; it uses neither your PC nor your OpenAI API or DeepL API (there is anyway an installed extension that lets you translate). There are many text models available: the default one is nice, but there are others that are quicker and lighter, or better and heavier; many tests can be done with that webui. To download a different model, return to the Colab page, stop the running cell, choose a different model from the dropdown and run the cell again, or download it from here into the models folder (for advanced users).
  2. You can still make the bot move his/her mouth by connecting your speakers with the virtual audio cable.
  3. You can't see the subtitles because I don't know how to save the text output in a file; it would be on Colab, so we would need a way to download it, and anyway the third method is a chat, so we can see the output in the chat.
  4. You can't use the microphone in the online version, while in the local one (on your PC) you can.
  5. I tried the gpt4all Colab online, but it is slower, so I didn't include it.

In the zip there is a new run.py (the changes are at the end) and the Colab file; the instructions show up if the user chooses the third method.
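Regarding point 3, one possible way to save each answer to a text file (which could then be downloaded from Colab and used as subtitles) is a small append helper; the filename here is just an example:

```python
# Possible approach for point 3: append every generated answer to a text
# file, which could then be downloaded from Colab and shown as subtitles.
# The default filename is an arbitrary example.

def save_subtitle(text: str, path: str = "subtitles.txt") -> None:
    """Append one answer per line to the subtitle file (UTF-8)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(text.strip() + "\n")
```

Calling `save_subtitle(answer)` after each generated response would accumulate the full transcript in one file.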