Mintplex-Labs / anything-llm

The all-in-one Desktop & Docker AI application with built-in RAG, AI agents, and more.
https://anythingllm.com

[BUG]: OpenAiWhisper always leads to a transcription failure #2617

Closed: fiyen closed this issue 1 week ago

fiyen commented 1 week ago

How are you running AnythingLLM?

Local development

What happened?

Once the Transcription API is set to OpenAiWhisper, transcription fails whenever I upload a .mp3 file. I found that the code at line 36 of OpenAiWhisper.js in the collector directory returns a result like `content: response`. I referred to the docs on openai.com, and the example there shows the text content should be extracted with `response.text`. After I changed the code to `content: response.text`, my problem was solved.
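(For illustration, the change being described is roughly the following. This is a paraphrase, not the verbatim contents of OpenAiWhisper.js, and it assumes the OpenAI Node SDK's `audio.transcriptions.create` against an endpoint that responds with a JSON body, as observed by the reporter.)

```js
// Hypothetical sketch of the reported fix, not the actual collector source.
const fs = require("fs");
const OpenAI = require("openai");

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function processFile(fullFilePath) {
  const response = await openai.audio.transcriptions.create({
    file: fs.createReadStream(fullFilePath),
    model: "whisper-1",
  });
  // Originally the transcript was returned as `content: response`; against an
  // endpoint that answers with JSON, the text lives under `response.text`.
  return { content: response.text, error: null };
}
```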

Are there known steps to reproduce?

Just change the Transcription API to OpenAiWhisper and upload a .mp3 file in your workspace.

timothycarambat commented 1 week ago

I can confirm that PR #2618 would actually introduce the very bug it claims to be alleviating. We specifically return text, not JSON, so the response body is the text: https://github.com/Mintplex-Labs/anything-llm/blob/e41a9beaaeb30d1b0f4feacdb5ccd7647b71eb7c/collector/utils/WhisperProviders/OpenAiWhisper.js#L25

The only way this behavior could be occurring is if you are on a modified version of AnythingLLM, or are using a proxy or host override that maps to an OpenAI-compatible service that is not fully 1:1 and always returns JSON.
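(For context, a minimal sketch of the distinction being made, assuming the OpenAI Node SDK and not the exact upstream code: when the request asks for `response_format: "text"`, the SDK resolves to a plain string, so the response itself is the transcript and `response.text` would be undefined against the real OpenAI API.)

```js
const fs = require("fs");
const OpenAI = require("openai");

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function transcribeAsText(fullFilePath) {
  const response = await openai.audio.transcriptions.create({
    file: fs.createReadStream(fullFilePath),
    model: "whisper-1",
    response_format: "text",
  });
  console.log(typeof response); // "string" against the real OpenAI API
  return response;
}
```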

timothycarambat commented 1 week ago

@fiyen We just merged a slight modification of this PR in #2621, which should support your use case without breaking compatibility with the current OpenAI implementation in AnythingLLM.
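(For readers following along, a compatibility check along these lines would accept both response shapes. This is an illustrative sketch with a hypothetical helper name, not the actual patch in #2621.)

```js
// Illustrative only: normalize the transcription result so both a plain-text
// body (real OpenAI with response_format: "text") and a JSON body from a
// not-quite-1:1 OpenAI-compatible proxy ({ text: "..." }) yield the same content.
function normalizeTranscription(response) {
  if (typeof response === "string") return response;
  if (response && typeof response.text === "string") return response.text;
  return null; // unknown shape: treat as a failed transcription
}
```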