langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

voice is default #7330

Closed: 401557122 closed this issue 1 week ago

401557122 commented 4 weeks ago

Self Checks

Dify version

0.7.0

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

I deployed CosyVoice-300M-SFT on xinference (model UID: CosyVoice-300M-SFT) and configured the model name in Dify as CosyVoice. I selected a voice in the web UI, but the voice passed into xinference is always the default.

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

dosubot[bot] commented 4 weeks ago

To resolve the issue where the voice selected on the web interface is not being passed into xinference and instead the default voice is always used, ensure that the selected voice is correctly passed through the entire flow from the web interface to the _invoke method and then to _tts_invoke_streaming.

  1. Check the _invoke Method: Ensure the voice parameter is correctly passed to the _invoke method:

    # Requires `from typing import Optional` at the top of the module.
    def _invoke(self, model: str, tenant_id: str, credentials: dict, content_text: str, voice: str, user: Optional[str] = None):
        # Hand the user-selected voice straight through to the streaming TTS call.
        return self._tts_invoke_streaming(model, credentials, content_text, voice)
  2. Check the _tts_invoke_streaming Method: Ensure the voice parameter is correctly passed to the _tts_invoke_streaming method. Note the fallback below: if the selected voice is missing or not in the model's supported-voice list, it is silently replaced with the default voice, which would produce exactly the behavior you are seeing (a standalone check against Xinference itself is sketched after this list):

    def _tts_invoke_streaming(self, model: str, credentials: dict, content_text: str, voice: str) -> any:
        handle = RESTfulAudioModelHandle(credentials['model_uid'], credentials['server_url'], auth_headers={})
        # Fall back to the model's default voice when the requested one is unsupported.
        model_support_voice = [x.get("value") for x in self.get_tts_model_voices(model=model, credentials=credentials)]
        if not voice or voice not in model_support_voice:
            voice = self._get_model_default_voice(model, credentials)
        response = handle.speech(input=content_text.strip(), voice=voice, response_format="mp3", speed=1.0, stream=False)
        # Yield the audio back in 1 KiB chunks.
        for i in range(0, len(response), 1024):
            yield response[i:i + 1024]
  3. Ensure Correct Configuration: Make sure your Dify container can reach the Xinference endpoint. Use the host IP address instead of localhost in the Dify settings:

    In `Settings > Model Providers > Xinference`, enter:
    * Model name: `CosyVoice`
    * Server URL: `http://<Machine_IP>:9997` **Replace with your machine IP address**
    * Model UID: `CosyVoice-300M-SFT`
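
If both methods look correct, you can isolate the problem by calling Xinference directly with an explicit voice, bypassing Dify entirely. Below is a minimal sketch using the same RESTfulAudioModelHandle client that Dify itself uses; the server URL, model UID, and voice name are placeholders to replace with values from your own deployment, and the voice must be one your model actually reports:

    # Standalone check against Xinference; adjust the three values below first.
    from xinference_client.client.restful.restful_client import RESTfulAudioModelHandle

    server_url = "http://<Machine_IP>:9997"  # same URL as configured in Dify
    model_uid = "CosyVoice-300M-SFT"         # the model UID shown by xinference
    voice = "中文女"                          # replace with a voice your model lists

    handle = RESTfulAudioModelHandle(model_uid, server_url, auth_headers={})
    # With stream=False the client returns the complete audio as bytes.
    audio = handle.speech(input="test sentence", voice=voice, response_format="mp3", speed=1.0, stream=False)
    with open("test.mp3", "wb") as f:
        f.write(audio)

If test.mp3 comes back in the requested voice, Xinference is working and the problem lies in how Dify resolves the voice (see the fallback in step 2); if it still uses the default voice, the issue is on the Xinference side.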

Once the voice parameter is set and passed through the system correctly and the network settings are configured properly, the selected voice should be used instead of the default [1][2].

To continue talking to Dosu, mention @dosu.

401557122 commented 4 weeks ago

I modified the API code inside the Docker container to add a print('voice') statement, but nothing shows up in the Docker logs. Does Flask automatically reload modifications, or is something else wrong? @dosu

dosubot[bot] commented 4 weeks ago

Flask can automatically reload modifications if it is running in debug mode. To ensure Flask reloads your code changes in a Docker container, you need to set the DEBUG environment variable to true.

In your entrypoint.sh script, Flask is run with the --debug flag if the DEBUG environment variable is set to true:

if [[ "${DEBUG}" == "true" ]]; then
  exec flask run --host=${DIFY_BIND_ADDRESS:-0.0.0.0} --port=${DIFY_PORT:-5001} --debug
else
  exec gunicorn \
    --bind "${DIFY_BIND_ADDRESS:-0.0.0.0}:${DIFY_PORT:-5001}" \
    --workers ${SERVER_WORKER_AMOUNT:-1} \
    --worker-class ${SERVER_WORKER_CLASS:-gevent} \
    --timeout ${GUNICORN_TIMEOUT:-200} \
    --preload \
    app:app
fi

To enable automatic reloading, set the DEBUG environment variable to true in your Docker setup. (In the gunicorn branch, --preload loads the application once before forking workers, so code edits are never picked up there without restarting the container.) You can set the variable by adding the following line to your .env file:

DEBUG=true

Then, restart your Docker containers to apply the changes:

docker compose down
docker compose up -d

This should enable Flask's auto-reloading feature, and your modifications should be reflected in the Docker logs [1].
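
Once the container is running with DEBUG=true, you can watch for the reloader messages and your print output in the service logs. Assuming the API service is named api in your docker-compose file (the default in Dify's setup, but check yours):

    docker compose logs -f api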

To continue talking to Dosu, mention @dosu.

401557122 commented 4 weeks ago

Is there a performance difference between gunicorn and Flask's debug mode? @dosu

dosubot[bot] commented 4 weeks ago

Yes, there is a significant performance advantage when using gunicorn compared to running Flask in debug mode.

  1. Concurrency and Request Handling:

    • Flask Debug Mode: Runs the built-in development server, which is designed for convenience rather than throughput and offers little concurrency. This is suitable for development but not for production.
    • Gunicorn: A production-grade WSGI server that can handle multiple requests concurrently. It uses multiple worker processes or threads, depending on the configuration, which allows it to handle a higher load and provide better performance.
  2. Worker Management:

    • Flask Debug Mode: Does not support multiple workers, limiting its ability to handle concurrent requests.
    • Gunicorn: Can be configured with multiple workers and different worker classes (e.g., gevent for asynchronous handling), improving its ability to manage and process multiple requests simultaneously.
  3. Timeouts and Preloading:

    • Flask Debug Mode: Does not have advanced timeout and preloading configurations.
    • Gunicorn: Supports setting timeouts and preloading the application code before forking worker processes, which can reduce memory usage and improve performance.

In summary, gunicorn provides better performance, scalability, and reliability for production environments compared to Flask's debug mode [1].
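
As a concrete illustration, the gunicorn branch of entrypoint.sh above is tuned entirely through environment variables, so a production-oriented .env might look like this (the variable names come from the script; the values are only examples):

    DEBUG=false
    SERVER_WORKER_AMOUNT=4        # number of gunicorn worker processes
    SERVER_WORKER_CLASS=gevent    # asynchronous worker class
    GUNICORN_TIMEOUT=200          # per-request timeout in seconds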

To continue talking to Dosu, mention @dosu.