### System Info
Hi everyone, when trying to update from Llama 3 8B Instruct to Llama 3.1 8B Instruct, I noticed a crash:
```bash
Args {
model_id: "meta-llama/Meta-Llama-3.1-8B-Instruct",
…
```
-
https://github.com/huggingface/text-generation-inference
TGI's main features are quite awesome. It would be nice to add it as an additional inference implementation.
-
### Feature request
Would it be possible to build/publish an arm64 container image for the text-generation-inference? I would like to be able to run it on a NVIDIA GH200 which is an arm64-based syst…
-
### System Info
```bash
gpu=0
num_gpus=1
model=meta-llama/Meta-Llama-3.1-8B-Instruct
docker run -d \
--gpus "\"device=$gpu\"" \
--shm-size 16g \
-e HUGGING_FACE_HUB_TOKEN=$token \
-p 8082:80 …
```
-
### Your current environment
0.6.3.post1
### 4 🐛 generation scenarios
There are at least 4 generation use cases in vLLM:
1. offline generate
2. offline chat
3. online completion (similar …
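For the online cases, vLLM exposes an OpenAI-compatible HTTP server. A minimal sketch of hitting the completion endpoint, where the model name and port are assumptions:

```shell
# Hypothetical sketch: launch the OpenAI-compatible server, then query
# the /v1/completions endpoint. Model id and port are assumptions.
python -m vllm.entrypoints.openai.api_server \
  --model meta-llama/Meta-Llama-3.1-8B-Instruct --port 8000

curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "meta-llama/Meta-Llama-3.1-8B-Instruct", "prompt": "Hello", "max_tokens": 16}'
```

The chat case works analogously against `/v1/chat/completions`, which applies the model's chat template server-side.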
-
### System Info
I tried the following systems, both with the same exception:
- ghcr.io/huggingface/text-generation-inference:sha-6aebf44 locally with docker on nvidia rtx 3600
- ghcr.io/huggingface…
-
## Version
deepspeed: `0.13.4`
transformers: `4.38.1`
Python: `3.10`
Pytorch: `2.1.2+cu121`
CUDA: 12.1
## Error in Example (To reproduce)
Simply run this script:
https://github.com/micr…
-
The command is as follows:
```
docker run --gpus 'all' --shm-size 1g -p 9090:80 -v $HOME/codeshell/CodeShell-7B-Chat:/data \
--env LOG_LEVEL="info,text_generation_router=debug" \
ghcr.nju.edu.cn/hugging…
```
-
I attempted to serve the original base model of **Llama 3.1** in 4-bit, both with and without setting `load_in_4bit`. Below are my observations.
When `load_in_4bit = True`:
The model throws the f…
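For reference, the bare `load_in_4bit` flag has largely been superseded in transformers by passing a `BitsAndBytesConfig`. A minimal sketch, where the model id, compute dtype, and quant type are assumptions:

```python
# Hypothetical sketch: load a Llama 3.1 base model in 4-bit via
# BitsAndBytesConfig instead of the bare load_in_4bit flag.
# Model id, dtype, and quant type below are assumptions.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # enable 4-bit weight quantization
    bnb_4bit_compute_dtype=torch.bfloat16,   # dtype used for matmuls
    bnb_4bit_quant_type="nf4",               # NormalFloat4 quantization
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3.1-8B",
    quantization_config=bnb_config,
    device_map="auto",
)
```

Setting `bnb_4bit_compute_dtype` explicitly matters here, since the default fp32 compute path is noticeably slower on recent GPUs.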
-
### System Info
```
text-generation-launcher 2.1.0
```
### Information
- [X] Docker
- [X] The CLI directly
### Tasks
- [ ] An officially supported command
- [ ] My own modifications
### Reprod…