huggingface / text-generation-inference

Large Language Model Text Generation Inference
http://hf.co/docs/text-generation-inference
Apache License 2.0

ROCm: Server error: transport error when running batch size >=2 (Falcon-11B) #2043

Status: Open

almersawi commented 3 weeks ago

System Info

image: text-generation-inference:sha-bf3c813-rocm
GPU: AMD MI250
TGI args: --dtype float16 --model-id tiiuae/falcon-11B

P.S. Tested on meta-llama/Llama-2-7b-hf with no issues.
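
The report does not include the launch command. A plausible sketch, assuming TGI's documented ROCm Docker invocation; only the image tag and the TGI args are taken from the report, everything else (port, volume path, device flags) is an assumption:

```shell
# Assumed launch command (not in the report), following TGI's ROCm docs;
# the image tag and the trailing TGI args come from the System Info above.
docker run --rm -it \
  --device=/dev/kfd --device=/dev/dri --group-add video \
  --ipc=host --shm-size 1g -p 8080:80 \
  -v $PWD/data:/data \
  ghcr.io/huggingface/text-generation-inference:sha-bf3c813-rocm \
  --dtype float16 --model-id tiiuae/falcon-11B
```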

Reproduction
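
The issue body gives no repro steps beyond the title. A minimal sketch of forcing a batch of size >= 2, assuming a TGI server published on localhost:8080; the URL, prompts, and helper names (`make_payload`, `send`, `reproduce`) are illustrative, while the `/generate` request shape follows TGI's documented API:

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumed local endpoint; adjust to wherever the TGI container is published.
TGI_URL = "http://localhost:8080/generate"

def make_payload(prompt: str) -> dict:
    # Request body for TGI's /generate endpoint: input text plus parameters.
    return {"inputs": prompt, "parameters": {"max_new_tokens": 32}}

def send(prompt: str) -> int:
    # Post a single generate request and return the HTTP status code.
    req = urllib.request.Request(
        TGI_URL,
        data=json.dumps(make_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

def reproduce() -> list:
    # Two concurrent requests make the router schedule a batch of size >= 2,
    # which is where the "transport error" is reported to occur on ROCm.
    with ThreadPoolExecutor(max_workers=2) as pool:
        return list(pool.map(send, ["What is ROCm?", "What is Falcon?"]))
```

Calling `reproduce()` against a live server on the affected setup should surface the reported transport error once both requests are batched together.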

Expected behavior

TGI should handle batch sizes >= 2 on ROCm, just as it handles a batch size of 1.

almersawi commented 3 weeks ago

The same issue occurs with tiiuae/falcon-7b-instruct.