ardha27 / AI-Waifu-Vtuber

AI Vtuber for Streaming on Youtube/Twitch
758 stars 125 forks

[Feature Request]: Get answers by writing, and totally free (no Openai API) #13

Closed GiusTex closed 1 year ago

GiusTex commented 1 year ago

Can this webui be integrated into your code, so we don't have to use the OpenAI API, which is not free? (I know, I'm stingy.) Anyway, the webui does not require a powerful computer and supports several models. And based on this video, from minute 17:45, we see that the webui can also generate audio files (using the TTS extension).

  1. So, can a third method be added, to type a message and pass it to "speech_text" (if that is the right function)?
  2. Can the generated audio file be used to make the vtuber model's mouth move? I don't know if passing the audio through VB-Audio Virtual Cable is possible, or if that is the right method. Maybe this way we don't need the voice synthesizer/text-to-speech.

Edit:

  1. I think that webui is too complicated to use for writing and getting answers to then send to ardha27's app, but maybe this console is the right way. More details in the comments.
  2. I managed to connect the virtual audio cable to the speakers and make the vtuber model talk when the webui produces responses (totally free, without using the OpenAI API), but I still think it would be nice to be able to switch between 3 modes in ardha27's app: use the mic or the live chat when you stream, or just text if you don't stream and don't want to make too much noise.
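The three-mode idea above could be sketched roughly like this (a minimal sketch; the function names and mode numbers are assumptions for illustration, not ardha27's actual code):

```python
# Hypothetical sketch of the three input modes suggested above.
# Mode numbers and helper names are assumptions, not the app's real code.

def transcribe_microphone() -> str:
    # Placeholder: the real app would record audio and run speech-to-text.
    raise NotImplementedError

def fetch_latest_livechat_message() -> str:
    # Placeholder: the real app would poll the YouTube/Twitch chat.
    raise NotImplementedError

def get_user_message(mode: int) -> str:
    """Return the text to send to the AI, depending on the chosen mode."""
    if mode == 1:
        # Streaming with a microphone: record and transcribe speech.
        return transcribe_microphone()
    elif mode == 2:
        # Streaming: read the latest live-chat message.
        return fetch_latest_livechat_message()
    elif mode == 3:
        # Not streaming: just type the message, no noise needed.
        return input("You: ")
    raise ValueError(f"unknown mode: {mode}")
```

Whatever string this returns would then go through the existing answer/TTS pipeline unchanged.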
ardha27 commented 1 year ago
  1. that webui is using Silero TTS for the voice synthesizer, and I already applied it to my repo just now; you can check it.
  2. Yes, I use the generated audio to make the vtuber model move; you can check the demo/explanation video in the readme
GiusTex commented 1 year ago

Sorry, I didn't explain myself well (long story); I wanted to write my message, not record it, which is why I suggested that webui. If the text option (the third method I mentioned in point one) can be added without the new webui, perfect

ardha27 commented 1 year ago

Sorry, I didn't get it. So you want to use that webui for messaging and my code for doing the TTS?

GiusTex commented 1 year ago
ardha27 commented 1 year ago

oke, I understand. I also want to try using another LLM for my project. But so far I have never used any LLM besides OpenAI, and I'm afraid that if I run an LLM locally it will put a burden on my PC, since I will be running OBS and VTube Studio for streaming, and LLMs use a lot of VRAM: https://github.com/oobabooga/text-generation-webui/wiki/System-requirements

GiusTex commented 1 year ago

Understood. What about using it on Google Colab? It requires slightly different code, but at least that way we have more computation power

ardha27 commented 1 year ago

I'm still looking for LLM code on Colab that provides an API, so I can call that API from my code

GiusTex commented 1 year ago

Sorry, I don't understand the part about providing the API in your code on Colab; can you rephrase? Do you want to keep the API part? The webui doesn't need the API. Or maybe on Colab you add a new mode, mode 3, that calls the webui, which doesn't need an API, while the other modes use the default code with the API. So at the beginning, when we install the libraries, we install the webui too, and if the user chooses mode 3 the console prints the browser link to use it with the new code part.

Do you want an example? If you want, I can create a demo Colab with this idea (your default code + mode 3 with the webui)
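The mode-3 dispatch described above could look something like this (a sketch only; the endpoint URL and payload schema are assumptions based on text-generation-webui's API extension and may differ between versions, so check the docs of the version you install):

```python
# Sketch of a "mode 3" that queries a locally or Colab-hosted
# text-generation-webui instead of the OpenAI API. The endpoint URL and
# payload/response shapes below are assumptions based on the webui's API
# extension; verify them against your webui version before relying on them.
WEBUI_URL = "http://127.0.0.1:5000/api/v1/generate"

def build_payload(prompt: str, max_new_tokens: int = 200) -> dict:
    """JSON body for the webui's generate endpoint (assumed schema)."""
    return {"prompt": prompt, "max_new_tokens": max_new_tokens}

def ask(prompt: str, mode: int, post) -> str:
    """Dispatch by mode: mode 3 goes to the webui, other modes keep
    the existing OpenAI path.

    `post` is an HTTP POST callable (e.g. requests.post), injected so the
    dispatch logic can be exercised without a live server.
    """
    if mode == 3:
        resp = post(WEBUI_URL, json=build_payload(prompt))
        return resp.json()["results"][0]["text"]
    raise NotImplementedError("modes 1 and 2 keep the existing OpenAI code")
```

In real use you would call `ask(message, 3, requests.post)`; the `results[0]["text"]` parsing assumes the older API response shape.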

ardha27 commented 1 year ago

yes please, that would be nice

GiusTex commented 1 year ago

Oke.

Edit:

  1. I added a demo of a third method, which loads this chat webui online; it uses neither your PC nor your OpenAI API or DeepL API (there is anyway an installed extension that lets you translate). There are many text models available: the default one is nice, but there are others that are quicker and lighter, or better and heavier; many tests can be done with that webui. To download a different model, return to the Colab page, stop the running cell, choose a different model from the dropdown and run the cell again, or download it from here into the models folder (for advanced users).
  2. You can still make the bot move his/her mouth by connecting your speakers with the virtual audio cable.
  3. You can't see the subtitles because I don't know how to save the text output in a file; it would be on Colab, so we would need a way to download it, and anyway the third method is a chat, so we can see the output in the chat.
  4. You can't use the microphone in the online version, while in the local one (on your PC) you can.
  5. I tried the gpt4all Colab online, but it is slower, so I didn't include it.

In the zip there is a new run.py (the changes are at the end) and the Colab file; the instructions show up if the user chooses the third method.
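Regarding point 3, one possible way to save each answer to a text file (which could then be downloaded from Colab and used as subtitles) is a small append helper; the filename here is just an example:

```python
# Possible approach for point 3: append every generated answer to a text
# file, which could then be downloaded from Colab and shown as subtitles.
# The default filename is an arbitrary example.

def save_subtitle(text: str, path: str = "subtitles.txt") -> None:
    """Append one answer per line to the subtitle file (UTF-8)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(text.strip() + "\n")
```

Calling `save_subtitle(answer)` after each generated response would accumulate the full transcript in one file.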