huggingface / text-generation-inference

Large Language Model Text Generation Inference
http://hf.co/docs/text-generation-inference
Apache License 2.0

Server error when running Llama using TGI #1712

Closed — QianguoS closed this issue 4 months ago

QianguoS commented 5 months ago

System Info

Latest version of the Docker image

Information

Tasks

Reproduction

1. Docker starts without any errors.
2. An error occurs during the request inference process.

Expected behavior

When I run concurrent requests on the TGI framework, the following error occurs:

2024-04-07T02:20:36.588030Z ERROR generate_stream{parameters=GenerateParameters { best_of: None, temperature: Some(0.8), repetition_penalty: None, frequency_penalty: Some(0.0), top_k: None, top_p: Some(0.96), typical_p: None, do_sample: true, max_new_tokens: Some(4096), return_full_text: None, stop: [], truncate: None, watermark: false, details: false, decoder_input_details: false, seed: None, top_n_tokens: None, grammar: None }}:async_stream:generate_stream:infer:send_error: text_generation_router::infer: router/src/infer.rs:705: Request failed during generation: Server error: error trying to connect: No such file or directory (os error 2)
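The parameters in the log correspond to a streaming generation request. A minimal sketch of how such a request could be reproduced against a locally running TGI instance (the host, port, and prompt are assumptions; the sampling parameters match those in the error log):

```shell
# Hypothetical reproduction: stream a generation request with the
# sampling parameters seen in the error log. Assumes a TGI server
# is listening on 127.0.0.1:8080.
curl -N http://127.0.0.1:8080/generate_stream \
    -X POST \
    -H 'Content-Type: application/json' \
    -d '{
          "inputs": "What is deep learning?",
          "parameters": {
            "do_sample": true,
            "temperature": 0.8,
            "top_p": 0.96,
            "max_new_tokens": 4096
          }
        }'
```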

suparious commented 5 months ago

What command do you use to start the docker image?

What version of CUDA are you using? Did you enable the CRI driver from the nvidia-docker-toolkit?
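The `No such file or directory (os error 2)` in the log typically means the router could not connect to a shard process, which can happen when a shard crashes at startup, e.g. because the container cannot see the GPU. A quick way to check that the NVIDIA container toolkit is wired up correctly (the CUDA image tag here is only an example and may need to match your driver version):

```shell
# Sanity check: if this fails, TGI's shard processes will not
# be able to see the GPU either.
docker run --rm --gpus all nvidia/cuda:12.1.0-base-ubuntu22.04 nvidia-smi
```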

Narsil commented 5 months ago

Make sure you're using the command from the readme:

https://github.com/huggingface/text-generation-inference?tab=readme-ov-file#docker

It contains important flags.
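For reference, the readme's Docker example looked roughly like the following at the time of this thread (the model id and image tag are placeholders and will vary; check the linked readme for the current command):

```shell
# Example launch command from the TGI readme.
# --shm-size 1g is one of the important flags: shards use shared
# memory for inter-process communication.
model=HuggingFaceH4/zephyr-7b-beta
# Share a volume with the container to avoid re-downloading weights on every run.
volume=$PWD/data

docker run --gpus all --shm-size 1g -p 8080:80 \
    -v $volume:/data \
    ghcr.io/huggingface/text-generation-inference:latest \
    --model-id $model
```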

github-actions[bot] commented 4 months ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.