Generated audio fails to play after 3rd audio

Adri6336 / gpt-voice-conversation-chatbot

Allows you to have an engaging and safely emotive spoken / CLI conversation with the AI ChatGPT / GPT-4 while giving you the option to let it remember things discussed.

GNU General Public License v3.0

298 stars, 50 forks

Generated audio fails to play after 3rd audio #2

Closed Gottano closed 1 year ago

Gottano commented 1 year ago

This could be a problem on my end; please excuse me if that's the case, but I'm new to this.

I have to run MVS as admin to make the playback work (I'm not using the ElevenLabs API), but it only generates three audio files and then starts giving the following error:

`Error 259 for command:
    play ./messages/3-5-2023/6.mp3 wait
The driver cannot recognize the specified command parameter.

Error 263 for command:
    close ./messages/3-5-2023/6.mp3
The specified device is not open or is not recognized by MCI.

Failed to close the file: ./messages/3-5-2023/6.mp3
[i]: DETECTED LANGUAGE: en
[X] Error trying to speak: [Errno 13] Permission denied: './messages/3-5-2023/6.mp3'`

Adri6336 commented 1 year ago

Heyo! I'm sorry to hear that this script is giving you errors. I'm not terribly certain what's going on, so let me see if I can work through this with you.

So that error indicates that you may not have the permission to interact with that file via the bot. I'm not sure why, but it's probably related to what you're running as admin. I've been talking with the bot to see if it has any knowledge as to solving your problem. It's as confused as I am as to the cause, but says this about your combination of errors:

This error message is indicating that the specified device, in this case the file './messages/3-5-2023/6.mp3', is not open or recognized by the MCI (Media Control Interface). Additionally, the error message '[Errno 13] Permission denied' suggests that the user account being used to access the file does not have the necessary permissions to do so. It might be necessary to check the file permissions and ensure that the right account has access to it.

I also noticed that you mentioned running MVS as admin. Could you expand on what you mean by MVS please?
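The permission check suggested above can be sketched in a few lines of Python. This is a minimal sketch, not part of the bot: `describe_access` is a hypothetical helper name, and the path is simply the one from the error message. As far as I know, on Windows `os.access` mostly reflects the read-only attribute, so a file that is still held open by the audio player may pass this check and still fail with `[Errno 13]`.

```python
import os


def describe_access(path: str) -> dict:
    """Report whether the current process can see, read, and write a file.

    Hypothetical helper for troubleshooting; it is not part of the bot.
    """
    return {
        "exists": os.path.exists(path),
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
    }


if __name__ == "__main__":
    # Path copied from the error message; change it to the file that failed.
    print(describe_access("./messages/3-5-2023/6.mp3"))
```

If the file exists and is reported as readable and writable, the permission error is more likely caused by something still holding the file open than by account permissions.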

Gottano commented 1 year ago

Hey there, thanks for the quick reply.

Yes I also asked the bot and it suggested that I run the script as admin, which I've been doing on MVS -> Microsoft Visual Studio.

What do you use to run it?

Adri6336 commented 1 year ago

That may be the issue's cause then. I have never tested this script being run in Visual Studio, so this bug would never have popped up for me during development. I'll see if I can try to replicate this error on my end and figure out how to fix it for other VS users.

For the time being, let's see if we can get it working properly on your end using a different method. For the most part I exclusively run the script either via the terminal / console (using PowerShell itself or Windows Terminal) or via a shortcut I made that has the Python interpreter run the script when I double click it. The console way is the most reliable way in my experience. To get started using a console, please try the following:

1. Press Windows + R.

This should open a small textbox. Enter "powershell" into the textbox, then press Enter. You should then see a PowerShell window open (for some reason on my end it opens a bit off screen, so feel free to move it around to give it a good fit).

If this doesn't pop up for some reason, right click the Windows logo and select "Run". The box should then open.

2. Navigate to the folder where you've got the bot's script.

Idk how this would differ from the Visual Studio experience, or if you're familiar with the command line. In case you're unfamiliar I'll give you a brief summary of how to get to and fro (feel free to skip this if you're familiar). You will need to use a couple of commands: ls and cd.

ls: This will show you the contents of your current folder. Towards the left-hand side of the screen you'll see some text like: "-a----". Any item on a line that starts with a 'd' like "d-r---" is a folder that you can navigate to. Once you recognize the folder you're looking for, enter it using the cd command.

cd: When you use this command, you will move into a new folder. You can use "cls" to clear the screen of your previous folder's contents. To properly use this command, you will need to enter the following:

cd <folder name>

Please note that any folder with a space in its name will need to be referenced in quotations. For example, a folder named "cheese types" would be navigated to via the cd command as:

cd "cheese types"

If you accidentally go into the wrong folder, you can go back up one with ".." as follows:

cd ..

3. Once you are in the proper folder (where main.py and gptcli.py are at), you can run the script by entering the following command:

python main.py

If the key is not in the text file, you can start by entering it as an argument:

python main.py <openai-key>

4. (Reinstall requirements if they aren't installed outside of VS)

I don't know much about how VS runs python scripts, but if it's anything like PyCharm, there's a good chance that you may have installed your requirements to a virtual environment. If this is the case, you'll need to install the requirements again using pip as a user (that is, not admin). You can do this with:

pip install -r requirements.txt

Please try that and let me know if it works for you.
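To see whether the interpreter you are launching is the system Python or one inside a virtual environment, a quick check like the following may help. It is a minimal sketch using only the standard library; `in_virtual_env` is a hypothetical helper name and it assumes the environment was created with `venv` or a recent `virtualenv`.

```python
import sys


def in_virtual_env() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtual environment:", in_virtual_env())
```

Running it once from Visual Studio and once from the console shows whether both are using the same interpreter, and therefore the same installed packages.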

Gottano commented 1 year ago

Hello again, thanks a lot for your help on this.

I got the same-looking error in PowerShell. I wasn't using the ElevenLabs API, so I switched to that (I didn't realise they had a free tier) and it worked quite well, until I changed the subject and it broke!

The "listening" green screen does seem to hang sometimes, but it eventually snaps out of it. It sometimes does seem quite sluggish as it processes. Is this normal?

Adri6336 commented 1 year ago

So, I've been trying to replicate your error and haven't been able to. Are you using an up-to-date version? I remember having a similar issue at a point in the past that I fixed (regarding the errors playing sound).

Could you expand on what you mean by it breaking when changing the subject? Was it the same issue as before?

As for the ElevenLabs speed, yeah it do be quite slow. Unfortunately that's just an issue with the resources available to them and the way they manage requests. Even with a paid tier it does have latency.

The green listening screen usually stays on if there's too much background noise, but sometimes it can occur without a noticeable reason. This unfortunately has to do with the 3rd party modules that I use, so I don't have the ability to do much to affect it.

Gottano commented 1 year ago

I'm using the most up to date version.

The error after it "breaks" seems to be the same as before. It does save the audio file (but doesn't play it), and when I listened to it, it had reverted back to the default / flat voice (not the ElevenLabs one).

Adri6336 commented 1 year ago

The only issues related to error 259 that I can find online point to the playsound module, but all the fixes are to downgrade your installation to version 1.2.2, which is what is listed in the requirements.txt file and should already be installed. Maybe if you try reinstalling playsound, the script will work. I apologize for the inconvenience. To do this, try uninstalling playsound with:

pip uninstall playsound

then run this command:

pip install -r requirements.txt

If that doesn't work, try removing it again and running this command:

pip install playsound==1.2.2

Hopefully that will fix the issue. Beyond that, as far as I can tell based on what we've discussed and what I could find online, the problem has to do with how playsound is working on your OS. It's not a universal issue unfortunately, so I haven't been able to replicate it where I'm at (I'm using an up-to-date Windows 10 laptop atm) and find a fix.

These were some of the sources that I found regarding this problem:

https://stackoverflow.com/questions/69245722/error-259-on-python-playsound-unable-to-sound

https://stackoverflow.com/questions/69065485/sound-file-will-not-play-using-playsound-module-python-error-259
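Before and after reinstalling, it may be worth confirming which playsound version the interpreter actually sees. The following is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+); `installed_version` is a hypothetical helper name, and 1.2.2 is the version named in requirements.txt above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("playsound")
    print("playsound:", found or "not installed")
    if found is not None and found != "1.2.2":
        # requirements.txt pins 1.2.2, so any other version is worth replacing.
        print("Expected 1.2.2; try: pip install playsound==1.2.2")
```

Run it with the same `python` command used to start main.py, so that it inspects the same set of installed packages.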

Adri6336 commented 1 year ago

(also btw the trial for ElevenLabs runs out quickly, so you should be shifted back to the Google speech when you run out of tokens)

Gottano commented 1 year ago

Ok thanks a lot. Will try your suggestion 😉

Gottano commented 1 year ago

Alright, so I uninstalled playsound and installed requirements.txt, but got the same error. Then I ran pip install playsound==1.2.2 and it is working fine now. Thanks for all your help!

Now, another thing.

It can sometimes take up to 30 seconds to go from question to hearing the answer. The bottleneck seems to be with Google's STT; it sometimes kinda "hangs" for ages.
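One way to confirm where the time goes is to time each stage separately. The following is a minimal sketch, not code from the bot: the stage names and the `time.sleep` calls are stand-ins for the real speech-to-text, chat, and text-to-speech calls, and `timed` is a hypothetical helper.

```python
import time
from contextlib import contextmanager

# Maps a stage name to the number of seconds it took.
timings = {}


@contextmanager
def timed(stage: str):
    """Record how long the enclosed block takes under the given stage name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] = time.perf_counter() - start


if __name__ == "__main__":
    with timed("speech-to-text"):
        time.sleep(0.05)  # stand-in for the recognition call
    with timed("chat completion"):
        time.sleep(0.02)  # stand-in for the GPT request
    with timed("text-to-speech"):
        time.sleep(0.03)  # stand-in for generating the audio

    # Slowest stage first.
    for stage, seconds in sorted(timings.items(), key=lambda item: item[1], reverse=True):
        print(f"{stage}: {seconds:.2f}s")
```

Wrapping the real calls the same way would show whether the delay comes from recognition, the chat request, or audio generation.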

Also, ElevenLabs is super expensive. I find that Amazon Polly TTS is almost as good (free for one year for new accounts and much cheaper than ElevenLabs after that).

How would I go about adapting your code to use Amazon Polly for TTS and AWS Transcribe for STT? (AWS STT gives you 60 mins for free per month for new accounts.)

I'm asking because I do not have the knowledge of how to do the above, but I'm stoked about this project!

Thanks

Adri6336 commented 1 year ago

Noice, I'm happy to hear it's working now! I wonder why the requirements file wasn't installing properly; imma have to look into it to make sure it works well for others.

I haven't heard about Amazon Polly, I'll def have to check that out! I do plan on adding a feature where you can swap voices better (e.g. change ElevenLabs voice, use different models, etc.), but that will take some time probably.

If you want to mod the bot for the time being to use Amazon Polly, your best bet will be to figure out how to use the API for that and modify the function in chatbot.py called "tts11AI". If they have code with curl in it, and you don't know how to implement it, give the code to GPT and ask it to convert it into python code. It should be able to help you greatly to this end.
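For reference, a Polly call from Python might look roughly like this. It is a rough sketch, not the bot's code: `polly_tts` is a hypothetical stand-in for whatever replaces "tts11AI" (whose real signature is not shown here), it assumes `pip install boto3`, and it assumes AWS credentials and a region are already configured. Note that AWS normally authenticates with an access key ID plus a secret key rather than a single API key. The `client` parameter exists only so the function can be tried with a fake client.

```python
from typing import Any, Optional


def polly_tts(text: str, out_path: str, voice_id: str = "Joanna",
              client: Optional[Any] = None) -> str:
    """Synthesize text with Amazon Polly and save it as an MP3 file.

    Hypothetical helper; the function name and parameters are assumptions.
    """
    if client is None:
        # Imported lazily so the module loads even without boto3 installed.
        import boto3
        client = boto3.client("polly")

    response = client.synthesize_speech(
        Text=text,
        OutputFormat="mp3",
        VoiceId=voice_id,
    )
    # AudioStream is a file-like object containing the encoded audio.
    with open(out_path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())
    return out_path

# Example call (needs real AWS credentials):
#     polly_tts("Hello there", "./messages/test.mp3")
```

The returned path can then be handed to whatever the bot already uses for playback.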

Adri6336 commented 1 year ago

Also, once you modify that function properly, you should be able to use the Amazon key in the ElevenLabs place in keys.txt. Just be sure not to alter the text in front of the equals sign.

Gottano commented 1 year ago

Ok cool beans!
