Open RapidRabbit-11485 opened 1 year ago
This requires a custom TTS client to support connecting and streaming data to two different places simultaneously. Currently joining Discord calls is not a feature of Speaker.bot. We will look into this as a possible 2.0 feature.
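To make the "two different places simultaneously" requirement concrete, here is a minimal sketch of the fan-out idea, assuming the TTS audio arrives as byte chunks. The names `fan_out` and `make_recording_sink` are hypothetical and are not part of Speaker.bot or any Discord library. The sinks here only record what they receive; a real client would replace them with a local playback writer and a Discord voice sender.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List, Tuple

# A sink is any async function that accepts one chunk of audio bytes.
Sink = Callable[[bytes], Awaitable[None]]


async def fan_out(chunks: Iterable[bytes], sinks: List[Sink]) -> int:
    """Send every chunk to all sinks concurrently; return the number of chunks sent."""
    count = 0
    for chunk in chunks:
        # gather() waits until every sink has accepted this chunk,
        # so the destinations stay in step with each other.
        await asyncio.gather(*(sink(chunk) for sink in sinks))
        count += 1
    return count


def make_recording_sink(store: List[bytes]) -> Sink:
    """Hypothetical stand-in for a real destination (speakers, Discord call)."""
    async def sink(chunk: bytes) -> None:
        store.append(chunk)
    return sink


def demo() -> Tuple[int, List[bytes], List[bytes]]:
    local: List[bytes] = []   # stands in for the streamer's local output
    remote: List[bytes] = []  # stands in for the Discord call
    chunks = [b"\x00\x01", b"\x02\x03"]
    sent = asyncio.run(
        fan_out(chunks, [make_recording_sink(local), make_recording_sink(remote)])
    )
    return sent, local, remote
```

Waiting on all sinks per chunk keeps the example simple; a slow destination would stall the other one, so a real client would probably want a queue per destination instead.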
This is going to take a custom client app, similar to the implementation of Speech-to-Text. I also need to learn the Discord Voice API, unless we can find a specialist in the community who has worked with it before. Definitely in the 2.0 conversation, or PNGTuber-GPT Pro, whatever shape it takes as we continue on the roadmap.
What kind of use case is this for? Just outputting the bot audio to the calls along with the stream?
So, how this came about: people who play games together on stream frequently use Discord. Once they reach a certain level, they are probably using something more sophisticated, like VDO.Ninja. Some people actually send their bot and the bot audio through VDO.Ninja so that everyone can hear the bot when it talks. However, it was brought up that it would be simpler if what the bot says could play on the Discord call without taking over the streamer's mic in Discord. If it could use the Discord API to play directly through a bot account on the call, that would be better. However, the practicalities of implementing it have pushed this onto the back burner behind some more critical features.
It was also brought up that once this was implemented, it could be further enhanced by other features in development, like speech-to-text, to actually have the bot listen to Discord and take commands from the users in it. There are definitely solid use cases, but it hasn't seen strong support, and it hasn't had priority for development yet.
I haven't heard of VDO.Ninja before, but it seems really neat. If I'm following right, the goal is to create a separate bot user that joins a Discord call to play the bot audio, because using something like VDO.Ninja feels a bit extra. Playing it in Discord currently requires changing the input device in Discord to the bot's virtual cable, so the user and the bot can't both be talking at once (and that doesn't sound very efficient anyway).
I wish I had any knowledge of Discord's API to contribute, but I think a workable solution for having the bot output in a Discord call would be to set up VB-Audio VoiceMeeter. You can set up input from the user's microphone as one source and the bot's VB-Audio cable as the second source, set the output to whatever the primary output is, mute the physical microphone's A channel so there's no echo, and finally change the mic in Discord to the VoiceMeeter output. This also means there's no need to set up the "monitor output" in OBS's advanced audio properties, because the bot audio will play through VoiceMeeter.
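For anyone new to audio routing, the mixing step described above boils down to summing two inputs into one stream that Discord then treats as the mic. Below is a conceptual sketch only, assuming both inputs are 16-bit signed mono PCM at the same sample rate in native byte order. The function name `mix_pcm16` and its gain parameters are hypothetical; this is not how VoiceMeeter is implemented or configured.

```python
from array import array


def mix_pcm16(mic: bytes, bot: bytes,
              mic_gain: float = 1.0, bot_gain: float = 1.0) -> bytes:
    """Sum two 16-bit PCM buffers sample by sample, clipping to the int16 range."""
    mic_samples = array("h")
    mic_samples.frombytes(mic)
    bot_samples = array("h")
    bot_samples.frombytes(bot)

    length = max(len(mic_samples), len(bot_samples))
    mixed = array("h")
    for i in range(length):
        # Treat the shorter input as silence once it runs out.
        m = mic_samples[i] if i < len(mic_samples) else 0
        b = bot_samples[i] if i < len(bot_samples) else 0
        total = int(m * mic_gain + b * bot_gain)
        # Clip instead of overflowing, which would cause harsh distortion.
        mixed.append(max(-32768, min(32767, total)))
    return mixed.tobytes()
```

Setting `mic_gain=0.0` would correspond roughly to muting the microphone on one path while still passing the bot audio through.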
You're following correctly, for sure. For most folks here, Virtual Audio Cable is probably their first exposure to this kind of audio routing. I do agree with your solution, it just takes a bit of know-how. Also, some of the streamers already use VoiceMeeter somewhere in their tool chain, so using it for Discord too adds more complication. I'm honestly not all that familiar with it, other than knowing it is an audio bus tool for routing inputs and outputs. I just know it has come up in conversations as well. I don't have enough knowledge at this point to have a workable solution with Discord either, but this issue is here to track it, so I don't forget to look into it further.
The bot needs to be able to join Discord calls so that people streaming together can all hear the bot's responses.