irudnyts / openai

An R package-wrapper around OpenAI API
https://irudnyts.github.io/openai/

Whisper interface not working in Spanish #49

Closed: dietrichson closed this issue 11 months ago

dietrichson commented 11 months ago

I was running a test to compare Whisper with Google's transcription. The audio file is in Spanish, so I called the function like this:

library(openai)

# Transcribe a Spanish-language audio file with the Whisper model
transcription <- create_transcription(
  file = "sound-data/29320.mp3",
  language = "es",
  model = "whisper-1"
)

What I get back is:

"Más temprano que tarde. Cuestión Política. Daniel Ciaschetti, doctor en Ciencia Política, docente e investigador de la Universidad de la República. Daniel Ciaschetti, doctor en Ciencia Política, doctor en Ciencia Política, doctor en Ciencia Política, doctor en Ciencia Política, doctor en Ciencia Política,

and so on for a couple of hundred lines. It seems to transcribe the first sentence and then get stuck, repeating the last part of it for the rest of the transcript. I am not sure whether this is a problem with the R package or a bug in the API.

dietrichson commented 11 months ago

I investigated further, and it turns out that the audio file I was trying to transcribe had some music near the beginning, after the first two sentences or so. Once I manually cut the music out, the model performed as advertised. I am not sure how to handle this in an automated workflow, but I suspect it is not an issue with the R package itself and is better discussed in the OpenAI forums. @irudnyts I am closing this issue unless you have anything to add.
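
One possible way to catch this failure mode in an automated workflow is to flag transcripts in which the same phrase repeats many times in a row, and send those files for manual review. The following is only a minimal sketch in base R: the helper names `max_consecutive_repeats` and `looks_stuck` and the default threshold of 4 are assumptions made for illustration, not part of the openai package or the API.

```r
# Hypothetical helpers (not part of the openai package): detect a transcript
# that appears stuck repeating one phrase, as in the output quoted above.

max_consecutive_repeats <- function(text) {
  # Split on clause and sentence punctuation, then normalise whitespace and case.
  phrases <- unlist(strsplit(text, "[.,;!?]+"))
  phrases <- tolower(trimws(phrases))
  phrases <- phrases[nzchar(phrases)]
  if (length(phrases) == 0) return(0L)
  # rle() collapses consecutive identical values into runs; take the longest run.
  max(rle(phrases)$lengths)
}

looks_stuck <- function(text, threshold = 4L) {
  # The threshold is an arbitrary assumption; tune it on real transcripts.
  max_consecutive_repeats(text) >= threshold
}
```

Assuming the transcript text is available as a single character string (for the default response format this is probably `transcription$text`, but check the object you actually get back), a call such as `looks_stuck(transcription$text)` would return `TRUE` for output like the one quoted above. This only detects the symptom; it does not remove the music that seems to trigger it.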