-
### Feature request
I suggest better educating developers on how to download and optimize the model at build time (in a container or in a volume) so that the command `text-generation-launcher` serves as f…
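One way to do this is to bake the weights into the image at build time so the launcher never downloads at startup. A minimal sketch, not an official recipe: it assumes `huggingface-cli` is available inside the TGI image (it ships with `huggingface_hub`), and the model id is illustrative.

```dockerfile
# Sketch: pre-download weights at build time so text-generation-launcher
# can start serving immediately (model id is illustrative).
FROM ghcr.io/huggingface/text-generation-inference:latest
ENV HUGGINGFACE_HUB_CACHE=/data
# Assumption: huggingface-cli is present in the image via huggingface_hub.
RUN huggingface-cli download tiiuae/falcon-7b-instruct
# The image's entrypoint is text-generation-launcher; CMD supplies its args.
CMD ["--model-id", "tiiuae/falcon-7b-instruct"]
```

The same idea works with a mounted volume instead of an image layer: run the download once into the volume, then point `HUGGINGFACE_HUB_CACHE` at it.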
-
After I ran the fine-tuning script, it saved the adapter weights.
How can I run it with vLLM or TGI for efficient, fast inference?
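One common approach with vLLM is to serve the base model and attach the saved adapter as a LoRA module. A minimal sketch, assuming the adapter was saved by PEFT to `./adapter`; the base model id and adapter name are illustrative.

```shell
# Sketch: serve the base model with the LoRA adapter attached
# (model id, adapter name, and path are illustrative).
vllm serve meta-llama/Llama-3.1-8B \
    --enable-lora \
    --lora-modules my-adapter=./adapter
# Alternative: merge the adapter into the base weights with PEFT's
# merge_and_unload(), save the merged checkpoint, and point TGI's
# --model-id at that directory.
```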
-
### System Info
[tgi-gaudi](https://github.com/huggingface/tgi-gaudi) v1.2.
```
Error: Traceback (most recent call last):
  File "/usr/local/bin/text-generation-server", line 8, in <module>
    sys.exit(a…
```
-
# Software version
- [Umi-OCR_Rapid_dev_20231114.7z](https://github.com/hiroi-sora/Umi-OCR_v2/releases/download/dev%2F20231114/Umi-OCR_Rapid_dev_20231114.7z)
# Runtime environment
- Ubuntu 20.04
- wine-8.0.2
# As shown in the screenshot
…
-
I would like to understand the differences between optimum-neuron and [transformers-neuronx](https://github.com/aws-neuron/transformers-neuronx).
-
### Feature request
Add inference flags that enrich the output with explainability information or suppress specific input tokens/embedding spaces, as described [here](https://github.com/Aleph-Alpha/AtMan).…
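As a rough illustration of the suppression idea (this is not TGI's API): AtMan-style explanations scale down a token's pre-softmax attention score and observe how the output changes. A minimal NumPy sketch, where the function names and `factor` are my own choices:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def suppress_token(scores, idx, factor=0.1):
    """Scale the pre-softmax attention score of token `idx` by `factor`
    (factor < 1 shrinks its post-softmax attention weight)."""
    mod = scores.copy()
    mod[..., idx] += np.log(factor)  # adding log(factor) multiplies exp(score) by factor
    return softmax(mod)

scores = np.array([2.0, 1.0, 0.5])
base = softmax(scores)
supp = suppress_token(scores, 0)
# The suppressed token's attention weight drops; the weights still sum to 1.
```

Comparing model outputs with and without such suppression is what yields the per-token explanation.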
-
*Problem*
After building images using the Dockerfiles in this repository, following the instructions here: https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/docker-composer/xeon/README.md
…
-
### System Info
Using ghcr.io/huggingface/text-generation-inference:latest, but the same issue occurs with 0.9 and 1.0.2.
Trying to deploy with model_id `tiiuae/falcon-7b-instruct`.
### Information
- [X] Dock…
-
### The Feature
TGI supports a `truncate` parameter for handling scenarios where the number of input tokens exceeds the model limit.
Please support it.
### Motivation, pitch
User request.
### Twitter / LinkedIn details
_No response_
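For reference, TGI's `truncate` parameter keeps only the last `truncate` input tokens, dropping the oldest ones. A minimal sketch of that left-truncation behavior (the function name is my own):

```python
def truncate_input(token_ids, truncate):
    """Mimic TGI's `truncate` parameter: when the input exceeds the limit,
    keep only the last `truncate` tokens (left truncation)."""
    if truncate is None or len(token_ids) <= truncate:
        return token_ids
    return token_ids[-truncate:]

# e.g. a 6-token prompt truncated to a 4-token budget
print(truncate_input([1, 2, 3, 4, 5, 6], 4))  # → [3, 4, 5, 6]
```

Keeping the tail rather than the head preserves the most recent context, which is usually what chat-style prompts need.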
-
### System Info
TGI version: latest. The model is Cohere Aya 35B, 4-bit bnb-quantized. Originally, I quantized the base model and merged the fine-tuned adapters into it.
### Information
- [X] …