langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

Text-to-speech feature bug. Audio can't be played on iOS devices. #5843

Open annalihan opened 5 days ago

annalihan commented 5 days ago

Self Checks

Dify version

0.6.11

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

image

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

dosubot[bot] commented 5 days ago

Hello @annalihan! I'm here to assist you with any bugs, questions, or contributions for the Dify repository. If you need any help, feel free to ask!

The text-to-speech feature fails to play audio on iOS devices in version 0.6.11 of the self-hosted (Docker) application because ffmpeg is not installed. The error message "[openai] Error: ffmpeg is not installed" indicates that ffmpeg is required for the text-to-speech functionality to work properly.

To resolve this issue, you need to install ffmpeg on your system. Here are the installation steps for different operating systems:

Ubuntu:

  1. Open the terminal.
  2. Enter the following commands to install FFmpeg:
    sudo apt-get update
    sudo apt-get install ffmpeg
  3. Enter ffmpeg -version to check if it has been successfully installed.

macOS:

  1. Open the terminal.
  2. If you haven't installed Homebrew yet, you can install it by entering the following command in the terminal:
    /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
  3. Install FFmpeg with Homebrew:
    brew install ffmpeg
  4. Enter ffmpeg -version to check if it has been successfully installed.

Windows:

  1. Visit the official FFmpeg website and download a precompiled Windows build.
  2. Unzip the downloaded archive, which produces a folder with a name similar to "ffmpeg-20200715-51db0a4-win64-static".
  3. Move the unzipped folder to a location of your choice, for example, C:\Program Files.
  4. Add the absolute path of the FFmpeg bin directory to the system environment variables.
  5. Open the command prompt and enter ffmpeg -version to see if the FFmpeg version information is displayed, indicating successful installation.

CentOS:

  1. First, you need to enable the EPEL repository. In the terminal, enter:
    sudo yum install epel-release
  2. Then, enter:
    sudo rpm -Uvh http://li.nux.ro/download/nux/dextop/el7/x86_64/nux-dextop-release-0-5.el7.nux.noarch.rpm
  3. Update the installed packages:
    sudo yum update
  4. Finally, install FFmpeg:
    sudo yum install ffmpeg ffmpeg-devel
  5. Enter ffmpeg -version to check if it has been successfully installed.

Additionally, ensure that the AudioService in your application is correctly configured to handle text-to-speech requests. The ChatMessageTextApi class in your application should be properly set up to process these requests and return the audio data. If there are any issues with the service configuration or the API endpoints, it could also lead to failures in playing audio on iOS devices [1][2].

To continue talking to Dosu, mention @dosu.

annalihan commented 5 days ago

@dosu Thank you for your assistance. I have installed ffmpeg on my system.

ffmpeg version 5.1.4-0+deb12u1 Copyright (c) 2000-2023 the FFmpeg developers
built with gcc 12 (Debian 12.2.0-14)
configuration: --prefix=/usr --extra-version=0+deb12u1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libglslang --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librist --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libsvtav1 --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --disable-sndio --enable-libjxl --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-libplacebo --enable-librav1e --enable-shared
libavutil      57. 28.100 / 57. 28.100
libavcodec     59. 37.100 / 59. 37.100
libavformat    59. 27.100 / 59. 27.100
libavdevice    59.  7.100 / 59.  7.100
libavfilter     8. 44.100 /  8. 44.100
libswscale      6.  7.100 /  6.  7.100
libswresample   4.  7.100 /  4.  7.100
libpostproc    56.  6.100 / 56.  6.100

I can play it on PC or Android devices, but not on iOS devices.

dosubot[bot] commented 5 days ago

To resolve the issue of audio not playing on iOS devices despite being able to play it on PC and Android devices, ensure that the audio file format is supported by iOS. According to the Dify code, the allowed audio file extensions are:

ALLOWED_EXTENSIONS = ['mp3', 'mp4', 'mpeg', 'mpga', 'm4a', 'wav', 'webm', 'amr']

iOS devices typically support mp3, m4a, and wav formats. Ensure that the audio files you are trying to play are in one of these formats. If the audio files are in a format not supported by iOS, convert them to a supported format.

Here is a snippet of the relevant code for checking the file extension:

extension = file.mimetype
if extension not in [f'audio/{ext}' for ext in ALLOWED_EXTENSIONS]:
    raise UnsupportedAudioTypeServiceError()

Make sure the file.mimetype corresponds to a format supported by iOS [1].
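
To see which of the formats listed above a given device reports as playable, one option is to ask the audio element directly. The sketch below assumes a browser context. `CanPlay`, `pickPlayableType`, and `IOS_CANDIDATES` are hypothetical names introduced for illustration and are not part of Dify; `canPlayType` is the standard HTMLMediaElement method, which returns an empty string for types the browser cannot play.

```typescript
// Minimal surface of HTMLMediaElement used by this sketch.
interface CanPlay {
  canPlayType(type: string): string
}

// Hypothetical helper (not part of Dify): return the first MIME type the
// element reports as playable, or null if none of them are.
function pickPlayableType(audio: CanPlay, candidates: string[]): string | null {
  for (const type of candidates) {
    // canPlayType returns "", "maybe", or "probably"; "" means unsupported.
    if (audio.canPlayType(type) !== '')
      return type
  }
  return null
}

// MIME types for mp3, m4a, and wav respectively.
const IOS_CANDIDATES = ['audio/mpeg', 'audio/mp4', 'audio/wav']
```

In a browser this could be called as `pickPlayableType(document.createElement('audio'), IOS_CANDIDATES)`. A non-null result only says the format is accepted; it does not rule out other causes of failed playback.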


annalihan commented 5 days ago

@dosu file.mimetype is wav. I changed it to mp3, but it still doesn't work.

Here is my solution: call load() on the audio element after setting its src. I don't know why, but it works!

const audioUrl = URL.createObjectURL(blob)
audioRef.current!.src = audioUrl

// Load the audio to fix "Audio can't be played on iOS devices."
audioRef.current!.load()
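
The workaround above can be wrapped in a small helper. This is a minimal sketch, not Dify's actual implementation: `AudioLike` and `playFromUrl` are hypothetical names, and the interface lists only the HTMLAudioElement members the sketch uses. Calling `load()` after changing `src` asks the element to reset and re-select its media resource. The thread reports that this makes playback work on iOS but does not establish why.

```typescript
// Minimal surface of HTMLAudioElement that this sketch relies on.
interface AudioLike {
  src: string
  load(): void
  play(): Promise<void>
}

// Hypothetical helper (not part of Dify): point the element at a new URL,
// force a reload, then start playback.
function playFromUrl(audio: AudioLike, url: string): Promise<void> {
  audio.src = url
  // Explicitly reload the media resource after changing src; this is the
  // call the issue author reports as fixing playback on iOS.
  audio.load()
  return audio.play()
}
```

In the component from the issue, this could be called as `playFromUrl(audioRef.current!, URL.createObjectURL(blob))`.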