FlowiseAI / Flowise

Drag & drop UI to build your customized LLM flow
https://flowiseai.com
Apache License 2.0

[BUG] Error on uploading media file via API #3202

Closed xbiggyl closed 1 month ago

xbiggyl commented 1 month ago

**Describe the bug**
Uploading media files via the API returns the following error:

Error: predictionsServices.buildChatflow - 400 Invalid file format. Supported formats: ['flac', 'm4a', 'mp3', 'mp4', 'mpeg', 'mpga', 'oga', 'ogg', 'wav', 'webm']'

I have SPEECH TO TEXT via Whisper enabled in the chatflow settings, and recording directly in the Flowise chatflow chat window works perfectly.

**To Reproduce**
Enable SPEECH TO TEXT in your chatflow (I am using Whisper).

Send an HTTP POST request to your chatflow endpoint to upload a media file (in my case I'm trying an ogg file).

The docs explain that the JSON in the body of the POST request must look like this:


```
{
  "uploads": [
    {
      "data": "data:audio/ogg;codecs=opus;base64,VDJkblV3QUNBQUFBQUFBQUFBQUFBQUFBQUF", // The base64 string is clipped for readability.
      "type": "audio",
      "name": "3371a048d05b38f5680bb28a3b4944fb.ogg",
      "mime": "audio/ogg"
    }
  ]
}
```

Even following the [example in the docs](https://docs.flowiseai.com/using-flowise/uploads#audio) to the letter results in the same error.

Python, JS, and Postman all give the same error.
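
For anyone reproducing this, here is a minimal Python sketch of how the request body above could be assembled from raw audio bytes. The helper names `build_audio_upload` and `build_request_body` are hypothetical (not part of Flowise), the `codecs=opus` parameter from the docs example is omitted, and any other fields your chatflow expects in the body are left out. Sending the body is also left out, since the endpoint URL depends on your instance.

```python
import base64
import json


def build_audio_upload(audio: bytes, name: str, mime: str) -> dict:
    """Build one entry of the "uploads" array (hypothetical helper)."""
    encoded = base64.b64encode(audio).decode("ascii")
    return {
        # Data URL: "data:<mime>;base64,<payload>"
        "data": f"data:{mime};base64,{encoded}",
        "type": "audio",
        "name": name,
        "mime": mime,
    }


def build_request_body(audio: bytes, name: str, mime: str) -> str:
    """Serialize a body containing a single audio upload (hypothetical helper)."""
    return json.dumps({"uploads": [build_audio_upload(audio, name, mime)]})
```

The returned string would be sent as the JSON body of the POST request to the chatflow endpoint.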

**Expected behavior**
The media file should be uploaded and processed using Speech to Text.

**Setup**

-   Installation [docker]
-   Flowise Version [2.0.7]
-   OS: [Linux]
-   Browser [Chrome]

xbiggyl commented 1 month ago

I was able to get it to work by giving the "name" an .m4a extension for all file types. I tried different file types (wav, mp4, ogg); they only work if the extension in the name is set to m4a.

    [...]
    "type": "audio",
    "name": "XXXXXXXXXXXX.m4a", // only m4a works, regardless of the original codec encoded into the base64 data
    "mime": "audio/wav"
    [...]
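
A rough sketch of this workaround in Python, assuming the upload entry is a plain dict shaped like the one in the original report. The function name `with_m4a_name` is hypothetical; it only rewrites the extension in "name" and leaves "data" and "mime" untouched, which matches what worked for me.

```python
from pathlib import PurePosixPath


def with_m4a_name(upload: dict) -> dict:
    """Return a copy of an upload entry whose "name" ends in .m4a.

    Workaround only: the audio data and the "mime" field are not changed.
    """
    renamed = dict(upload)  # shallow copy so the caller's dict is not mutated
    renamed["name"] = str(PurePosixPath(upload["name"]).with_suffix(".m4a"))
    return renamed
```

For example, an entry named `3371a048d05b38f5680bb28a3b4944fb.ogg` would be sent as `3371a048d05b38f5680bb28a3b4944fb.m4a`.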

HenryHengZJ commented 1 month ago

That's strange; we don't have any hardcoded check, so it should work with other audio file types. Glad you figured it out! Closing for now.