-
Does RWKV 5 support vLLM, LMDeploy, TGI, Fastllm, or FasterTransformer?
What should I do to measure its inference performance, i.e. throughput, per-token latency, and end-to-end latency?
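Not RWKV-specific, but as a starting point, those numbers can be collected against any OpenAI-compatible HTTP endpoint with plain shell tools. In the sketch below, the URL, port, and model id are placeholders, not something the frameworks above are confirmed to expose for RWKV 5:

```bash
# Sketch only: assumes an OpenAI-compatible server at localhost:8000
# and a hypothetical model id; adjust both for your deployment.
URL=http://localhost:8000/v1/completions
MODEL=rwkv-5-world   # placeholder model id
BODY="{\"model\": \"$MODEL\", \"prompt\": \"Hello\", \"max_tokens\": 128}"

# End-to-end latency of a single request:
time curl -s "$URL" -H 'Content-Type: application/json' -d "$BODY" > /dev/null

# Crude throughput estimate: N concurrent requests over total wall time.
# Assumes each request generates the full 128 tokens.
N=8
start=$(date +%s)
for _ in $(seq $N); do
  curl -s "$URL" -H 'Content-Type: application/json' -d "$BODY" > /dev/null &
done
wait
elapsed=$(( $(date +%s) - start )); [ "$elapsed" -eq 0 ] && elapsed=1
echo "~$(( N * 128 / elapsed )) generated tokens/s across $N concurrent requests"
```

Per-token latency can then be approximated as generation time divided by `max_tokens`, or measured directly from inter-chunk gaps if the server supports streaming.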
-
The README mentions the ability to serve at scale with continuous batching.
Even if it isn't vLLM or TGI, is there some work on this that someone could point me to?
Is there any functioning packaging…
-
I experienced a failure using the TGI-NeuronX DLC on ml.trn1.32xlarge for Llama-2 70B.
* I am able to compile successfully on inf2.48xlarge with a context length of 2K, batch size of 4, and TP of 24, and furt…
-
### What behavior of the library made you think about the improvement?
As of now, Medusa generates hallucinations because the speculative multi-head does not support the Outlines decoding grammar.
…
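For context, a grammar-constrained request of the kind affected here looks roughly like the sketch below; the host, port, and JSON schema are illustrative assumptions, not taken from the report:

```bash
# Illustrative only: a TGI /generate call with a JSON grammar constraint.
# Endpoint and schema are assumptions made for the sake of the example.
curl -s http://localhost:8080/generate \
  -H 'Content-Type: application/json' \
  -d '{
    "inputs": "Give me the name and age of a person.",
    "parameters": {
      "max_new_tokens": 64,
      "grammar": {
        "type": "json",
        "value": {
          "type": "object",
          "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"}
          },
          "required": ["name", "age"]
        }
      }
    }
  }'
```

If I read the report correctly, once a Medusa speculative model is loaded, the draft heads propose tokens without consulting this grammar, which is where the hallucinations come from.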
-
**Is your feature request related to a problem? Please describe.**
HuggingFace TGI is a standard way to…
-
### System Info
image: text-generation-inference:sha-bf3c813-rocm
GPU: AMD MI250
TGI args: --dtype float16 --model-id tiiuae/falcon-11B
P.S. Tested on meta-llama/Llama-2-7b-hf; no issues there.
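The exact launch command isn't in the report; a typical ROCm invocation for this image, following the TGI AMD docs, would look roughly like the sketch below, so treat every flag and path as an assumption:

```bash
# Sketch of a typical TGI ROCm launch; the actual command used in this
# report is unknown.
docker run --rm -it \
  --device=/dev/kfd --device=/dev/dri --group-add video \
  --ipc=host --shm-size 1g -p 8080:80 \
  -v $PWD/data:/data \
  ghcr.io/huggingface/text-generation-inference:sha-bf3c813-rocm \
  --dtype float16 --model-id tiiuae/falcon-11B
```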
###…
-
Hi, it would be awesome to have Mermaid support. I'm not sure if this would be helpful to others, but I can look into adding support in the future (unless someone else is already working on it).
-
Hi everyone,
I have the following setup (containers are on the same device):
- Container 1: Nvidia NIM (openai-compatible) with Llama3 8B Instruct, port 8000;
- Container 2: chat-ui, port 3000.
…
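Assuming the goal is to point chat-ui at the NIM endpoint, a minimal `.env.local` sketch would look like the following; the container hostname `nim` and the model id are assumptions about this setup:

```bash
# Sketch only: register the NIM server as an OpenAI-compatible endpoint
# in chat-ui's .env.local. "nim" assumes both containers share a Docker
# network on which container 1 resolves under that name.
cat >> .env.local <<'EOF'
MODELS=`[
  {
    "name": "meta/llama3-8b-instruct",
    "endpoints": [
      {
        "type": "openai",
        "baseURL": "http://nim:8000/v1"
      }
    ]
  }
]`
EOF
```

If the containers don't share a network, the host's IP (or `host.docker.internal`, which on Linux may require `--add-host=host.docker.internal:host-gateway`) can stand in for the service name.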
-
### System Info
Running the TGI 2.0.3 Docker image on a VM with 8 NVIDIA L4 GPUs.
Command:
```bash
MODEL=codellama/CodeLlama-70b-Python-hf
docker run \
-m 320G \
--shm-size=40G \
-e NVIDIA_VISIBLE_DEVIC…
-
### System Info
latest TGI docker image
### Information
- [X] Docker
- [ ] The CLI directly
### Tasks
- [X] An officially supported command
- [ ] My own modifications
### Reproduction
1. Use …