langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Can't connect an audio file to the speech to text tool #11046

Open lucaseatp opened 2 hours ago

lucaseatp commented 2 hours ago

Dify version

0.11.2

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

  1. Add the "speech to text" tool to your workflow.
  2. Create a variable for an audio or video file.
  3. Try to connect the custom variable or sys.files to the speech to text tool.
  4. The tool shows no available variables, so the audio cannot be transcribed.

✔️ Expected Behavior

I can choose an audio/video variable to transcribe with Whisper or another model.

❌ Actual Behavior

No variable can be selected in the tool.

dosubot[bot] commented 2 hours ago

I found a similar unsolved discussion that might be relevant to your issue: Error in item: Invalid variable, which was last updated on November 21, 2024 [1].

Regarding your issue with connecting an audio file to the 'speech to text' tool, first confirm that the audio file format is supported. The supported formats are flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, and webm [2]. If your file is in one of these formats and the issue persists, the cause is more likely how variables are recognized or connected in the workflow. I couldn't find specific details on variable connection issues for the current version, so it is worth checking whether a newer release or patch addresses this problem.
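To rule out the file format before debugging the variable connection, a quick local check along these lines may help. This is only a sketch: the function name `is_supported_audio` and the sample file names are made up for illustration and are not part of Dify. It compares a file's extension against the format list quoted above; it does not inspect the file contents.

```python
from pathlib import Path

# Formats listed above as supported by the speech to text tool.
SUPPORTED_AUDIO_FORMATS = {
    "flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm",
}


def is_supported_audio(filename: str) -> bool:
    """Return True if the file extension is in the supported list.

    Hypothetical helper: the comparison is case-insensitive and looks
    only at the extension, not at the actual audio encoding.
    """
    suffix = Path(filename).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_AUDIO_FORMATS


if __name__ == "__main__":
    # Sample names are placeholders; substitute the file you are uploading.
    for name in ("interview.MP3", "meeting.wav", "notes.aac"):
        status = "supported" if is_supported_audio(name) else "not supported"
        print(f"{name}: {status}")
```

If the extension passes this check and the tool still offers no variables, the format is probably not the cause.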
