xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

TTS curl Internal Server Error #1995

Closed tsiens closed 2 months ago

tsiens commented 2 months ago

System Info

CUDA 12.1

Running Xinference with Docker?

Version info

0.13.3

The command used to start Xinference

docker

Reproduction

Requests through the Xinference Python client succeed, but curl requests return an error:

curl -X 'POST' \
  'http://127.0.0.1:9997/v1/audio/speech' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "ChatTTS",
    "text": "你好",
    "voice": "中文女"
  }'
Internal Server Error

curl -X 'POST' \
  'http://127.0.0.1:9997/v1/audio/speech' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "CosyVoice-300M-SFT",
    "text": "你好",
    "voice": "中文女"
  }'
Internal Server Error

Expected behavior

The request should return a result, as it does for the chat completions endpoint:

curl -X 'POST' \
  'http://127.0.0.1:9997/v1/chat/completions' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "qwen2-instruct",
    "messages": [
        {
            "role": "system",
            "content": "You are a helpful assistant."
        },
        {
            "role": "user",
            "content": "What is the largest animal?"
        }
    ],
    "max_tokens": 512,
    "temperature": 0.7
  }'
{"id":"chatfad8c3cd-6e0b-49a8-b4c8-f082362712f9","object":"chat.completion","created":1722499076,"model":"qwen2-instruct","choices":[{"index":0,"message":{"role":"assistant","content":"The largest animal is the blue whale (Balaenoptera musculus). Adult blue whales can grow up to 100 feet (30 meters) in length and weigh as much as 200 tons (about 400,000 pounds or 181,437 kilograms). They live in all of the world's oceans, but their populations were severely depleted by commercial whaling activities that historically targeted them. Despite being listed as an endangered species globally since the early 1970s, some populations have shown signs of recovery due to conservation efforts and restrictions on whaling. Blue whales feed primarily on krill, which they catch by opening their mouths wide and engulfing large volumes of water rich with prey particles before filtering it out through comb-like baleen plates hanging from their upper jaws."},"finish_reason":"stop"}],"usage":{"prompt_tokens":25,"completion_tokens":172,"total_tokens":197}}

qinxuye commented 2 months ago

Do you have the error stack on the server side?

tsiens commented 2 months ago

> Do you have the error stack on the server side?

No, there is no output at all on the server side.

qinxuye commented 2 months ago

@codingl2k1 Please take a look at this issue.

codingl2k1 commented 2 months ago

> @codingl2k1 Please take a look at this issue.

OK

codingl2k1 commented 2 months ago

Please try this curl command, which sends the text in the "input" field instead of "text":

curl -X 'POST' \
   'http://127.0.0.1:9997/v1/audio/speech' \
   -H 'accept: application/json' \
   -H 'Content-Type: application/json' \
   -d '{
     "model": "CosyVoice-300M-SFT",
     "input": "你好",
     "voice": "中文女"
   }'
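
For comparison, the same request can be sketched in Python with only the standard library. This is a minimal sketch, not the Xinference client: the helper names `build_speech_request` and `synthesize` are hypothetical, and the URL, model, and voice values are copied from the curl command above. The key detail is that the text to synthesize travels under `input`, not `text`.

```python
import json
import urllib.request

# Endpoint copied from the curl command above.
SPEECH_URL = "http://127.0.0.1:9997/v1/audio/speech"


def build_speech_request(model, text, voice, url=SPEECH_URL):
    """Build (but do not send) a JSON POST for the speech endpoint.

    Hypothetical helper for illustration. The text to synthesize is sent
    under the "input" key, matching the curl command above.
    """
    payload = {"model": model, "input": text, "voice": voice}
    body = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(model, text, voice):
    """Send the request and return the raw response bytes.

    Needs a running Xinference server with the model already launched.
    """
    with urllib.request.urlopen(build_speech_request(model, text, voice)) as resp:
        return resp.read()


request = build_speech_request("CosyVoice-300M-SFT", "你好", "中文女")
print(sorted(json.loads(request.data)))  # ['input', 'model', 'voice']
```

Calling `synthesize("CosyVoice-300M-SFT", "你好", "中文女")` against a running server would return the response body as bytes; nothing is sent until that function is called.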

If you want to send binary voice data with the request, it should be sent as a form request rather than as a JSON body. Please refer to the Xinference client: https://github.com/xorbitsai/inference/blob/main/xinference/client/restful/restful_client.py#L764
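
One way to read the note above is that binary voice data should travel as a multipart/form-data upload alongside the text fields. The sketch below shows how such a body can be encoded with the standard library. The helper `encode_multipart` is hypothetical, and the file field name `prompt_speech` is an assumption that should be checked against the linked client code before use.

```python
import uuid


def encode_multipart(fields, files):
    """Encode text fields and binary files as multipart/form-data.

    fields: dict mapping field name -> str value
    files:  dict mapping field name -> (filename, bytes, content type)
    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n'
                "\r\n"
                f"{value}\r\n"
            ).encode("utf-8")
        )
    for name, (filename, content, content_type) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n"
            "\r\n"
        ).encode("utf-8")
        parts.append(header + content + b"\r\n")
    # Closing delimiter marks the end of the multipart body.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


# Placeholder bytes standing in for a real audio file read from disk.
voice_bytes = b"RIFF....WAVE"

body, content_type = encode_multipart(
    {"model": "CosyVoice-300M-SFT", "input": "你好"},
    # "prompt_speech" is an ASSUMED field name; verify it in the client code.
    {"prompt_speech": ("prompt_speech", voice_bytes, "application/octet-stream")},
)
print(content_type.split(";")[0])  # multipart/form-data
```

The resulting `body` would be posted with the `Content-Type` header set to `content_type`, so the server can find the boundary that separates the parts.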

leslie2046 commented 2 months ago

Just remove the stream parameter.
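
As a small illustration of that suggestion, the sketch below drops a `stream` key from a request payload before it is sent. The helper name `without_stream` and the example payload (including `"stream": True`) are assumptions for illustration; the curl commands earlier in this thread do not show a stream parameter.

```python
def without_stream(payload):
    """Return a copy of the payload with any "stream" key removed."""
    return {key: value for key, value in payload.items() if key != "stream"}


# Hypothetical payload from a client that adds "stream" by default.
original = {
    "model": "CosyVoice-300M-SFT",
    "input": "你好",
    "voice": "中文女",
    "stream": True,
}
cleaned = without_stream(original)
print("stream" in cleaned)  # False
```

Returning a copy leaves the caller's original payload untouched, which keeps the change easy to revert while debugging.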