Closed · muddi900 closed this issue 1 year ago
Were you able to find a solution? I am getting the same error. It works very well in the Jupyter notebook app, but I keep getting this error in the Hugging Face application.
I have found a workaround using Replicate's implementation. It requires exposing a link to the file, because Replicate only works with hyperlinks. I am hoping the issue will be resolved by the time I go live.
If you are testing locally, you can use ngrok to get a public link to the file.
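To make the ngrok route concrete, here is a minimal stdlib sketch (all names and URLs below are illustrative, not from this thread): serve the saved audio over plain HTTP and hand Replicate the tunneled link.

```python
# Sketch of the ngrok workaround: serve the saved audio over HTTP so that
# Replicate, which only accepts hyperlinks, can fetch it. NGROK_BASE is a
# placeholder for the public URL printed by `ngrok http 8000`.
import functools
from http.server import HTTPServer, SimpleHTTPRequestHandler

NGROK_BASE = "https://example.ngrok.io"  # placeholder tunnel URL


def public_url(filename: str) -> str:
    """The hyperlink Replicate would download the file from."""
    return f"{NGROK_BASE}/{filename}"


def serve_uploads(directory: str, port: int = 8000) -> HTTPServer:
    """Serve `directory` on localhost so the ngrok tunnel can forward to it."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    return HTTPServer(("127.0.0.1", port), handler)
```

Calling `serve_uploads("uploads").serve_forever()` in a background thread and passing `public_url("sample.webm")` to Replicate is the general shape; the exact Replicate call is omitted here.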
I've run into this today as well, but with the WebM audio format, also using Flask. A web app records a brief spoken audio clip with mimeType 'audio/webm;codecs=opus'. When I save a copy of the recorded audio to a file, it is submitted to the API and transcribed correctly without issue. However, if I submit the request via my Flask app, I get the same error as @muddi900.
ffprobe info:
ffprobe version 5.1.2 Copyright (c) 2007-2022 the FFmpeg developers
built with Apple clang version 14.0.0 (clang-1400.0.29.202)
configuration: --prefix=/usr/local/Cellar/ffmpeg/5.1.2_6 --enable-shared --enable-pthreads --enable-version3 --cc=clang --host-cflags= --host-ldflags= --enable-ffplay --enable-gnutls --enable-gpl --enable-libaom --enable-libaribb24 --enable-libbluray --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-librav1e --enable-librist --enable-librubberband --enable-libsnappy --enable-libsrt --enable-libsvtav1 --enable-libtesseract --enable-libtheora --enable-libvidstab --enable-libvmaf --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxml2 --enable-libxvid --enable-lzma --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-libopencore-amrnb --enable-libopencore-amrwb --enable-libopenjpeg --enable-libspeex --enable-libsoxr --enable-libzmq --enable-libzimg --disable-libjack --disable-indev=jack --enable-videotoolbox
libavutil 57. 28.100 / 57. 28.100
libavcodec 59. 37.100 / 59. 37.100
libavformat 59. 27.100 / 59. 27.100
libavdevice 59. 7.100 / 59. 7.100
libavfilter 8. 44.100 / 8. 44.100
libswscale 6. 7.100 / 6. 7.100
libswresample 4. 7.100 / 4. 7.100
libpostproc 56. 6.100 / 56. 6.100
Input #0, matroska,webm, from 'sample.webm':
Metadata:
encoder : Chrome
Duration: N/A, start: 0.000000, bitrate: N/A
Stream #0:0(eng): Audio: opus, 48000 Hz, mono, fltp (default)
OS: macOS 12.6.3
Python version: 3.10.1
openai version: 0.27.2
I was getting this error on some files, and while looking at them I noticed that they lack a proper mp3 header. My solution was to run them through ffmpeg once before uploading, using the acodec='copy' parameter so that the actual audio content of the mp3 file is not modified:

    import ffmpeg  # the ffmpeg-python package

    ffmpeg \
        .input(path) \
        .output('_temp.mp3', acodec='copy') \
        .overwrite_output() \
        .run()

and then uploading _temp.mp3 instead of path.
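The snippet above uses the ffmpeg-python package; the same remux can also be done by invoking the ffmpeg CLI directly. A sketch (assumes ffmpeg is on PATH; the helper names are mine, not from the thread):

```python
# Same idea with the ffmpeg CLI: -acodec copy remuxes the stream into a
# file with a proper container header without re-encoding the audio.
import subprocess


def remux_cmd(src: str, dst: str = "_temp.mp3") -> list:
    # Build the command; -y overwrites dst, like overwrite_output() above.
    return ["ffmpeg", "-y", "-i", src, "-acodec", "copy", dst]


def remux(src: str, dst: str = "_temp.mp3") -> str:
    subprocess.run(remux_cmd(src, dst), check=True)
    return dst
```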
Well, the file works fine for me when I use it as a local file. It is only when the file is uploaded server-side that there is an issue.
While my current workaround uses local storage, in production that would be unfeasible: the Flask backend will probably be hosted on an ephemeral system.
I also got the same error when loading audio files locally. Transcribing with the text response format works:

    audio2 = open("Greeti.mp3", "rb")
    sub = openai.Audio.transcribe("whisper-1", audio2, response_format="text")
    'Greetings.\n'

but changing the response format to a different string returns an error:

    sub = openai.Audio.transcribe("whisper-1", audio2, response_format="srt")
    openai.error.InvalidRequestError: Invalid file format. Supported formats: ['m4a', 'mp3', 'webm', 'mp4', 'mpga', 'wav', 'mpeg']
I fought with this for a long time. I finally got it working by not using the MediaRecorder() API on the frontend. I switched to using RecordRTC:
const startRecording = () => {
    setIsRecording(true)
    navigator.mediaDevices.getUserMedia({ audio: true }).then((stream) => {
        const options = {
            type: 'audio',
            mimeType: 'audio/mp3',
            numberOfAudioChannels: 1,
            recorderType: RecordRTC.StereoAudioRecorder,
            checkForInactiveTracks: true,
            timeSlice: 5000,
            ondataavailable: (blob) => {
                socket.emit('audio', { buffer: blob })
            },
        }
        const recordRTC = new RecordRTC(stream, options)
        setRecorder(recordRTC)
        recordRTC.startRecording()
    })
}
and it worked immediately.
How do I deal with it on the server side? It complains about the missing name of the file.
Any updates on this? The request fails inside an endpoint, but works with local files.
I have a variation on the solution using RecordRTC that was posted above. It shows how to start/stop recording, reset the audio channel, and send the result with an Ajax request (multipart form data):
if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) {
    navigator.mediaDevices.getUserMedia({ audio: true })
        .then((stream) => {
            const options = {
                type: 'audio',
                mimeType: 'audio/mp3',
                numberOfAudioChannels: 1,
                recorderType: RecordRTC.StereoAudioRecorder
            }
            const recordRTC = new RecordRTC(stream, options);
            $(document).on("click", "#record-button", () => {
                let recordButton = $("#record-button");
                // already recording, hit stop
                if (recordButton.attr("recording") === "true") {
                    recordButton.attr("recording", false);
                    recordButton.html("REC");
                    recordRTC.stopRecording(async () => {
                        let blob = await recordRTC.getBlob();
                        var form = new FormData();
                        form.append("file", blob);
                        $.ajax({
                            type: "POST",
                            data: form,
                            url: "",
                            processData: false,
                            contentType: false,
                            success: function (data) {
                                // ...
                                recordRTC.reset();
                            },
                            error: (err) => {
                                // ...
                                recordRTC.reset();
                            }
                        });
                    });
                }
                // not recording, hit play
                else {
                    //mediaRecorder.start();
                    recordButton.attr("recording", true);
                    recordButton.html("STOP");
                    recordRTC.startRecording();
                }
            });
        })
        // Error callback
        .catch((err) => {
            console.error(`The following getUserMedia error occurred: ${err}`);
        });
}
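For completeness, a sketch of the Flask view that could sit behind this POST (my code, not from the thread). Note that the blob above is appended without a filename, so the server has to supply a name with an audio extension before handing the bytes to the transcription API:

```python
# Sketch of a Flask endpoint receiving the multipart upload above.
# Assumption: the API infers the audio format from the upload's filename,
# so the nameless blob gets wrapped in a named in-memory stream.
import io

from flask import Flask, request

app = Flask(__name__)


@app.route("/", methods=["POST"])
def transcribe_upload():
    upload = request.files["file"]
    buf = io.BytesIO(upload.read())
    buf.name = upload.filename or "audio.mp3"  # the blob may arrive nameless
    # transcript = openai.Audio.transcribe("whisper-1", buf)
    return "ok"
```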
Having the same issue with an mp3 file written by ffmpeg (LAME mp3).
Try this code, which writes the bytes to a named temporary file with an .mp3 suffix before uploading:

    import tempfile

    with tempfile.NamedTemporaryFile(suffix='.mp3') as temp_file:
        temp_file.write(audio_file)
        temp_file.flush()
        temp_file.seek(0)
        transcript_read = openai.Audio.transcribe("whisper-1", temp_file)
For me specifically it was on iPhone. I was saving a valid .wav file (it worked when I tested it), but a file type detector tool showed it was actually some other format that Apple was saving to. You can either convert between file types with the ffmpeg Node library, or, for iPhone specifically, save it as a .m4a file instead of .wav.
I was fighting with similar problems. It currently works on an iPhone 13 (iOS 17) with MediaRecorder.
Client:
...
const recorder = new MediaRecorder(stream, { mimeType: 'audio/mp4' });
...
Server:
...
const formData = new FormData();
formData.append('file', buffer, { filename: "audio.mp4", contentType: "audio/mp4" });
...
const response = await axios.post('https://api.openai.com/v1/audio/transcriptions', formData, {
    headers: {
        'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
        ...formData.getHeaders(),
    },
});
For Chrome & co. I go with 'audio/webm' on the client and 'audio/mp3' on the server.
I don't really expect this to be bulletproof, but for the time being it's stable enough for my needs.
BR, Adrian
Hi, I think I found a viable workaround. The issue seems to be in the way the BufferReader is reading files; bypassing the BufferReader fixed it for me. I can't investigate more because of the outage, unfortunately. Might be related to #727.
In this line, if instead of file=buffer_reader I put file=open(args.file, "rb"), the function returns normally instead of a 400 error.
edit: found a fix and submitted a PR in #733
OpenAI audio endpoints generally require that a filename with an extension be provided in the upload; it is used to determine the file type.
This is made more convenient in the new v1 of the SDK, where you can pass a pathlib.Path to the API:

    from pathlib import Path
    from openai import OpenAI

    openai = OpenAI()
    speech_file_path = Path(__file__).parent / "Downloads" / "sample.mp3"
    openai.audio.transcriptions.create(model='whisper-1', file=speech_file_path)
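For the pre-v1 SDK used elsewhere in this thread (0.27.x), the same requirement can be met by giving an in-memory stream a `name` attribute, from which the SDK picks the upload filename. A sketch (the helper name is mine, not from the thread):

```python
# Sketch for the pre-v1 SDK: the upload's filename (and hence its
# extension) is taken from the stream's `name` attribute, so set one on
# bytes read from a nameless request stream.
import io


def with_audio_name(data: bytes, filename: str = "audio.mp3") -> io.BytesIO:
    buf = io.BytesIO(data)
    buf.name = filename
    return buf

# raw = request.files[fileName].stream.read()
# openai.Audio.transcribe("whisper-1", with_audio_name(raw, "clip.webm"))
```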
Describe the bug
Hello, I am trying to integrate the Whisper API into my Flask app. However, I get the "Invalid file format" error when I pass in the file received from the Flask endpoint. Loading the file in the interactive console works fine.
To Reproduce
Call the openai.Audio.transcribe method with the bytes obtained through `request.files[fileName].stream.read()`.
Code snippets
the FFprobe info of the file:
OS
Windows 11
Python version
Python 3.10.5
Library version
0.27.2