Can't connect an audio file to the speech to text tool

langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.


Self Checks

[X] This is only for bug reports; if you would like to ask a question, please head to Discussions.
[X] I have searched for existing issues, including closed ones.
[X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[X] [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
[X] Please do not modify this template :) and fill in all the required fields.

Dify version

0.11.2

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

Add the "speech to text" tool to your workflow.
Try to create a variable for an audio or video file.
Try to connect the custom variable or sys.files to the speech to text tool.
The tool doesn't show any available variable, so the audio can't be transcribed.

✔️ Expected Behavior

I can choose an audio/video variable to convert with Whisper or another model.

❌ Actual Behavior

No variable can be selected in the tool.

I found a similar unsolved discussion that might be relevant to your issue: Error in item: Invalid variable, which was last updated on November 21, 2024 ^[1].

Regarding your issue with connecting an audio file in the 'speech to text' tool, ensure that the audio file format is supported. The supported formats are flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, and webm ^[2]. If your file is in one of these formats and the issue persists, it might be related to how variables are being recognized or connected in the workflow. Unfortunately, I couldn't find specific details on variable connection issues in the current version. You might want to check if there are any updates or patches that address this problem.
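As a quick way to rule out the format question before digging into variable wiring, the extension check can be sketched in Python. This is only an illustration based on the format list quoted above: the helper name `is_supported_audio_file` and the constant are hypothetical and not part of Dify's API, and checking the extension says nothing about the actual codec inside the file.

```python
from pathlib import Path

# Formats taken from the list quoted in the reply above.
# The constant and helper names are hypothetical, not Dify identifiers.
SUPPORTED_AUDIO_FORMATS = frozenset(
    {"flac", "mp3", "mp4", "mpeg", "mpga", "m4a", "ogg", "wav", "webm"}
)


def is_supported_audio_file(filename: str) -> bool:
    """Return True if the file's extension is in the supported list."""
    # Path.suffix keeps the leading dot (".mp3"), so strip it and
    # lowercase the result to make the check case-insensitive.
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_AUDIO_FORMATS


if __name__ == "__main__":
    for name in ("meeting.MP3", "interview.webm", "notes.aac"):
        print(name, is_supported_audio_file(name))
```

If the file passes this check and the tool still lists no variables, the format is probably not the cause and the problem is more likely in how the workflow exposes file variables to the tool.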
