Closed: mosnicholas closed this issue 1 year ago
I can report a similar issue. The transcript I'm getting is:
It's like a, uh, it's like a, Oblivion Charm Do you know what that is? It's a, uh, um, what's that called? What's that called? It's like a, um, a, um, gum... What's that called? Huh? It's a gum... A gum... Okay, now I'd like to let you know, So actually, I have a feature... So if you go to Amazon,
Meanwhile, the original audio contains a speech about analyzing a customer's skin, determining their skin type, and recommending the appropriate products and treatments.
Even with the latest version (v.4.0), I have this problem with audio recorded in Safari (macOS and iOS). The mp4 file is uploaded to the server correctly and plays back fine, but the transcription is always "Subtítulos realizados por la comunidad de Amara.org" (Spanish for "Subtitles made by the Amara.org community"). On Windows (Chrome and Firefox) it works fine, and on macOS with Chrome it is fine too. On macOS with Safari, and on iOS (Chrome and Safari), the result is always "Subtítulos realizados por la comunidad de Amara.org".
Thanks for reporting! Issues on this repo are intended for bugs in the library; please report bugs in the API to https://community.openai.com/c/api/bugs/30
Describe the bug
I’m using Whisper through NextJS. I’m calling the API directly, given that the openai-node package doesn’t have great support for the Whisper API ([Whisper] cannot call createTranscription function from Node.js due to File API · Issue #77 · openai/openai-node · GitHub). I’m calling it like so:
I’m running into a lot of issues using the prompt. It is usually a two-sentence prompt that has (1) a high-level overview of the transcript (it is a call about xyz), and (2) some proper nouns and competitor names that I want to make sure are spelled correctly. This follows the example provided in OpenAI’s docs. An example of the type of prompt I’m using is:
This transcript is about Bayern Munich and various soccer teams. Common competitors in the league include Real Madrid, Barcelona, Manchester City, Liverpool, Paris Saint-Germain, Juventus, Chelsea, Borussia Dortmund, and AC Milan.
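For readers who want to compare results with and without a prompt like this, here is a minimal sketch of a direct call to the transcription endpoint. It is not the snippet from this report. It assumes Node 18+ (or a browser), where `fetch`, `FormData`, and `Blob` are globals, and the `whisper-1` model name. The helper names `buildTranscriptionForm` and `transcribe` are hypothetical.

```typescript
// Sketch only: helper names are hypothetical, not part of any OpenAI package.
// Assumes fetch, FormData and Blob are available as globals (Node 18+ or a browser).

const TRANSCRIPTIONS_URL = "https://api.openai.com/v1/audio/transcriptions";

// Build the multipart body. `prompt` is optional so the same helper can be
// used to compare output with and without it.
function buildTranscriptionForm(audio: Blob, filename: string, prompt?: string): FormData {
  const form = new FormData();
  form.append("file", audio, filename);
  form.append("model", "whisper-1");
  if (prompt !== undefined && prompt.trim() !== "") {
    form.append("prompt", prompt);
  }
  return form;
}

// Send the request and return the transcript text from the default JSON response.
async function transcribe(
  apiKey: string,
  audio: Blob,
  filename: string,
  prompt?: string
): Promise<string> {
  const response = await fetch(TRANSCRIPTIONS_URL, {
    method: "POST",
    // Content-Type is left unset so that fetch can add the multipart boundary.
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildTranscriptionForm(audio, filename, prompt),
  });
  if (!response.ok) {
    throw new Error(`Transcription request failed: ${response.status} ${await response.text()}`);
  }
  const data = (await response.json()) as { text: string };
  return data.text;
}
```

Calling `transcribe` twice on the same file, once with the prompt and once without, makes it easier to tell whether the prompt is what triggers the bad output.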
However, 80% of the time I use the prompt, I get garbage, hallucinated output. It ends up in a loop, repeating the same thing (e.g. a competitor’s name, or a URL made up from one of the competitors’ names). An example of the output I’m seeing:
In the past, the league has been a place of competition for the players. The league has been a place of competition for the players. The league has been a place of competition for the players. The league has been a place of competition for the players. The league has been a place of competition for the players. The league has been a place of competition for the players....
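Output like this can at least be detected on the client side. The sketch below is one possible heuristic, not a fix for the underlying behavior: it flags a transcript when the same sentence appears several times in a row. The function names and the threshold of three repeats are assumptions chosen for illustration.

```typescript
// Hypothetical helpers: flag transcripts in which one sentence repeats back to back.

// Split on sentence-ending punctuation and normalize case and whitespace.
function splitSentences(text: string): string[] {
  return text
    .split(/[.!?]+/)
    .map((sentence) => sentence.trim().toLowerCase())
    .filter((sentence) => sentence.length > 0);
}

// Return true when any sentence occurs at least `minRepeats` times consecutively.
function looksLikeLoop(text: string, minRepeats = 3): boolean {
  const sentences = splitSentences(text);
  let run = 1;
  for (let i = 1; i < sentences.length; i++) {
    run = sentences[i] === sentences[i - 1] ? run + 1 : 1;
    if (run >= minRepeats) {
      return true;
    }
  }
  return false;
}
```

A caller could, for example, retry the request without the prompt when `looksLikeLoop` returns true. Whether that retry helps for a given audio file is something to check empirically.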
Note: the prompt sometimes works better when I call the API directly from the terminal.
To Reproduce
Download an audio file, and input a prompt like the one I shared. Example of the output I see:
Code snippets
No response
OS
macOS
Node version
v19.9.0
Library version
3.2.1