langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

chattts cannot autoplay #6997

Closed tsiens closed 4 weeks ago

tsiens commented 3 months ago


Dify version

0.6.15

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

1. Add TTS to the workflow features and turn on autoplay.
2. Run a task in the workflow. ChatTTS does not autoplay, and I don't see any TTS requests being made.

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

dosubot[bot] commented 3 months ago

The text-to-speech (TTS) feature in your self-hosted Dify version 0.6.15 (Docker) is not autoplaying because FFmpeg must be installed on the system for the TTS function to work properly.

To resolve this issue, ensure that FFmpeg is installed on your system. You can refer to the installation guide provided in the documentation:

If you need to use OpenAI TTS, FFmpeg must be installed on the system for it to function properly. For more details, refer to: Link [1].

Additionally, ensure that the TTS feature is properly enabled in your workflow settings. The AudioService class in the Dify codebase includes a method transcript_tts that checks if TTS is enabled in the workflow settings. If TTS is not enabled, it raises a ValueError with the message "TTS is not enabled". This check is performed during the invocation of the TTS function, ensuring that the feature is enabled and configured correctly [2].

Here is a relevant snippet from the AudioService class:

@classmethod
def transcript_tts(cls, app_model: App, text: str, streaming: bool,
                   voice: Optional[str] = None, end_user: Optional[str] = None):
    if app_model.mode in [AppMode.ADVANCED_CHAT.value, AppMode.WORKFLOW.value]:
        workflow = app_model.workflow
        if workflow is None:
            raise ValueError("TTS is not enabled")

        features_dict = workflow.features_dict
        if 'text_to_speech' not in features_dict or not features_dict['text_to_speech'].get('enabled'):
            raise ValueError("TTS is not enabled")

        voice = features_dict['text_to_speech'].get('voice') if voice is None else voice
    else:
        text_to_speech_dict = app_model.app_model_config.text_to_speech_dict

        if not text_to_speech_dict.get('enabled'):
            raise ValueError("TTS is not enabled")

        voice = text_to_speech_dict.get('voice') if voice is None else voice

Make sure that FFmpeg is installed and that the TTS feature is enabled in your workflow settings to get the TTS feature working properly [3].
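The enabled-check in the snippet above can be tried in isolation. The following is a minimal sketch, not Dify's actual API: `resolve_tts_voice` is a hypothetical helper name, the plain dictionary stands in for `workflow.features_dict`, and the voice names are made up for illustration.

```python
from typing import Any, Mapping, Optional


def resolve_tts_voice(features: Mapping[str, Any], voice: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper mirroring the workflow branch of the quoted check.

    Raises ValueError when the 'text_to_speech' feature is missing or disabled;
    otherwise returns the explicit voice, falling back to the configured one.
    """
    tts = features.get("text_to_speech")
    if not tts or not tts.get("enabled"):
        raise ValueError("TTS is not enabled")
    return tts.get("voice") if voice is None else voice


if __name__ == "__main__":
    # Illustrative feature dictionaries; "voice-a" is a placeholder voice name.
    enabled = {"text_to_speech": {"enabled": True, "voice": "voice-a"}}
    print(resolve_tts_voice(enabled))
    try:
        resolve_tts_voice({"text_to_speech": {"enabled": False}})
    except ValueError as exc:
        print(f"rejected: {exc}")
```

If this check were the cause, the request would fail with "TTS is not enabled" rather than silently skipping playback, so it is mainly useful for ruling the setting out.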

For detailed steps on installing FFmpeg on different operating systems, refer to the following:

Windows:

  1. Visit FFmpeg official website and download the compiled Windows shared library.
  2. Extract the downloaded folder and move it to a desired location, e.g., C:\Program Files\.
  3. Add the bin directory of the extracted folder to the system environment variables.
  4. Open Command Prompt and type ffmpeg -version to verify the installation.

Ubuntu:

  1. Open Terminal.
  2. Run sudo apt-get update and then sudo apt-get install ffmpeg.
  3. Verify the installation by typing ffmpeg -version.

CentOS:

  1. Enable EPEL repository: sudo yum install epel-release.
  2. Run sudo rpm -Uvh http://li.nux.ro/download/nux/dextop/el7/x86_64/nux-dextop-release-0-5.el7.nux.noarch.rpm.
  3. Update yum packages: sudo yum update.
  4. Install FFmpeg: sudo yum install ffmpeg ffmpeg-devel.
  5. Verify the installation by typing ffmpeg -version.

Mac OS X:

  1. Open Terminal.
  2. If Homebrew is not installed, install it using: /bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)".
  3. Install FFmpeg using Homebrew: brew install ffmpeg.
  4. Verify the installation by typing ffmpeg -version [4].
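The `ffmpeg -version` verification step can also be scripted. This is a small sketch using only the Python standard library; `ffmpeg_version` is a hypothetical helper name. It only reports on the environment it runs in, so in a Docker deployment it would need to run inside the container where the TTS code executes.

```python
import shutil
import subprocess
from typing import Optional


def ffmpeg_version(executable: str = "ffmpeg") -> Optional[str]:
    """Return the first line of `<executable> -version`, or None if unavailable."""
    path = shutil.which(executable)  # look the binary up on PATH
    if path is None:
        return None
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    version = ffmpeg_version()
    if version is None:
        print("FFmpeg was not found on PATH")
    else:
        print(version)
```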


crazywoola commented 3 months ago

You need to publish it and create a new conversation to make it work.

tsiens commented 3 months ago

> You need to publish it and create a new conversation to make it work.

I published it and created a new conversation, but it still doesn't play automatically. I still have to click the play button manually.

crazywoola commented 3 months ago

@ic-xu Could you take a look at this?

Link https://github.com/langgenius/dify/issues/7003

ic-xu commented 3 months ago

Can you check if there are any messages similar to 'tts-message' in the 'chat-message' interface?

tsiens commented 3 months ago

> Can you check if there are any messages similar to 'tts-message' in the 'chat-message' interface?

There is only chat-message. I still have to click the play button manually.

greycodee commented 2 months ago

I'm trying to fix it.

greycodee commented 2 months ago

I can't reproduce it locally. The voice plays automatically in my local environment.

Freya0214 commented 2 months ago

Hi, I encountered the same issue and found that the 'audio' field in the client chat-message response is empty. My Dify version is 0.70, and my Xinference version is 0.14.3.

Freya0214 commented 2 months ago

I noticed that in my chat-message response, between message_end and tts_message_end, the tts_message response data seems to be missing.

data: {"event": "message_end", "conversation_id": "4b36121c-d9bb-4bf2-86d3-b86c624d2b62", "message_id": "fc4f6ed4-e50d-418d-a779-fe15cb44155c", "created_at": 1724728736, "task_id": "1652232e-38d5-4de3-ada9-1fed6c855093", "id": "fc4f6ed4-e50d-418d-a779-fe15cb44155c", "metadata": {"usage": {"prompt_tokens": 7521, "prompt_unit_price": "0.0", "prompt_price_unit": "0.0", "prompt_price": "0.0", "completion_tokens": 1183, "completion_unit_price": "0.0", "completion_price_unit": "0.0", "completion_price": "0.0", "total_tokens": 8704, "total_price": "0.0", "currency": "USD", "latency": 38.93568039796082}}}

data: {"event": "tts_message_end", "conversation_id": "4b36121c-d9bb-4bf2-86d3-b86c624d2b62", "message_id": "fc4f6ed4-e50d-418d-a779-fe15cb44155c", "created_at": 1724728736, "task_id": "1652232e-38d5-4de3-ada9-1fed6c855093", "audio": ""}
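One way to check a captured chat-message stream for the missing audio is to scan its `data:` lines. The sketch below is illustrative only: `summarize_tts_events` is a hypothetical helper, the sample payloads are abbreviated versions of the ones above, and it assumes that `tts_message` events carry their audio in an `audio` field, as `tts_message_end` does.

```python
import json
from typing import Dict, Iterable, List, Optional


def summarize_tts_events(sse_lines: Iterable[str]) -> Dict[str, object]:
    """List the event names in a captured stream and count non-empty audio chunks."""
    events: List[Optional[str]] = []
    audio_chunks = 0
    for raw in sse_lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank lines and anything that is not a data line
        try:
            payload = json.loads(line[len("data:"):].strip())
        except json.JSONDecodeError:
            continue  # ignore payloads that are not valid JSON
        if not isinstance(payload, dict):
            continue
        event = payload.get("event")
        events.append(event)
        # Assumption: audio arrives in the "audio" field of tts_message events.
        if event == "tts_message" and payload.get("audio"):
            audio_chunks += 1
    return {"events": events, "audio_chunks": audio_chunks, "has_audio": audio_chunks > 0}


if __name__ == "__main__":
    # Abbreviated stand-ins for the two payloads quoted above.
    captured = [
        'data: {"event": "message_end", "message_id": "fc4f6ed4"}',
        "",
        'data: {"event": "tts_message_end", "message_id": "fc4f6ed4", "audio": ""}',
    ]
    print(summarize_tts_events(captured))
```

A stream like the one reported here would show `tts_message_end` in the event list with no audio chunks counted, which matches the symptom of nothing being played automatically.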