Problem with wrong audio type

hackingthemarkets / chatgpt-api-whisper-api-voice-assistant

chatgpt api and whisper api tutorial - voice conversation with therapist

Problem with wrong audio type #4

Closed iEnki closed 1 year ago

iEnki commented 1 year ago

Hey all,

I can't fix this problem. ffmpeg installed fine and the audio file is saved as .wav, but this error still stops my transcription. Any ideas for a fix? I'm on a Windows system; maybe there is a codec error?

Error: openai.error.InvalidRequestError: Invalid file format. Supported formats: ['m4a', 'mp3', 'webm', 'mp4', 'mpga', 'wav', 'mpeg']

iwmo commented 1 year ago

Yeah, having the same problem on macOS.

Jayqianonedream commented 1 year ago

Same for me; I tried on both Windows and Mac.

yinyijie commented 1 year ago

Same error here.

VikasSharma707 commented 1 year ago

Solved. For the solution, go to this repo: https://github.com/VikasSharma707/chatgpt-api-whisper-api-voice-assistant

Jupalaja commented 1 year ago

Hi guys, this worked for me:

import openai
from pydub import AudioSegment

# Convert the recorded WAV to MP3; export() returns the open output file
with open(audio, "rb") as audio_file_wav:
    audio_file_mp3 = AudioSegment.from_wav(audio_file_wav).export("audio.mp3", format="mp3")
transcript = openai.Audio.transcribe("whisper-1", audio_file_mp3)

I'm using pydub to convert the WAV audio file to MP3; the export produces an MP3 file that can be sent to the Whisper API without a problem.

yinyijie commented 1 year ago

Thanks, it resolved the issue.

eervin123 commented 1 year ago

Was having this problem as well. I found this solution, similar concept to the one above. We are just adding an audio file extension.

https://github.com/gradio-app/gradio/issues/3479

import os
import openai

def transcribe(audio):
    # Add a .wav extension so the API recognizes the file format
    os.rename(audio, audio + '.wav')
    with open(audio + '.wav', "rb") as file:
        return openai.Audio.transcribe("whisper-1", file).text

hackingthemarkets commented 1 year ago

Thanks for the rename suggestion, I added this to the repo :). I'm guessing the API didn't initially validate the extension, but they added this check after I recorded the tutorial.
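
For anyone landing here later, the rename fix above can be sketched a bit more defensively. The helper below is hypothetical (the name `ensure_audio_extension` and the `default_ext` parameter are not from the repo). It assumes, as guessed above, that the API decides the format from the file name, and it takes the list of supported formats from the error message in the first post. It copies the file instead of renaming it, so the original recording stays where the caller put it. It only changes the name and does not convert the audio data.

```python
import os
import shutil

# Formats listed in the error message quoted at the top of this thread.
SUPPORTED_FORMATS = {"m4a", "mp3", "webm", "mp4", "mpga", "wav", "mpeg"}


def ensure_audio_extension(path, default_ext="wav"):
    """Return a path whose extension is in SUPPORTED_FORMATS.

    Hypothetical helper: if `path` already has a supported extension it is
    returned unchanged; otherwise the file is copied to
    `path + "." + default_ext` and the new path is returned.
    """
    # splitext returns e.g. ".wav"; strip the dot and ignore case
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    if ext in SUPPORTED_FORMATS:
        return path
    new_path = f"{path}.{default_ext}"
    # Copy rather than rename so the caller's original file is untouched
    shutil.copyfile(path, new_path)
    return new_path
```

The returned path can then be opened with `open(..., "rb")` and passed to `openai.Audio.transcribe("whisper-1", file)` exactly as in the snippets above. If the recording is not actually WAV data, converting it (as in the pydub comment) is probably the safer route.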