-
How do we deploy this model via an API? Can I deploy it on vLLM or lmdeploy? I can't find any example of running it with HuggingFace transformers.
I want to deploy the 72B and 110B models.
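For the vLLM part of the question, vLLM ships an OpenAI-compatible HTTP server that can be launched from the command line. A minimal launch sketch, using Qwen/Qwen1.5-72B-Chat purely as a placeholder model id (substitute the checkpoint you actually want) and assuming four GPUs:

```shell
# Sketch: serve a 72B model behind vLLM's OpenAI-compatible API server.
# Model id, GPU count, and port are assumptions; adjust to your setup.
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen1.5-72B-Chat \
    --tensor-parallel-size 4 \
    --port 8000
```

Once up, the server exposes `/v1/chat/completions`, so any OpenAI-style client can talk to it. Treat this as a launch-command sketch, not a tested deployment recipe.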
-
Are you still working on this project?
-
### System Info
TGI Version: tried 2.0.3, 2.0.4, and 2.1.1; none of them works, but 2.0.2 does.
### Information
- [X] Docker
- [ ] The CLI directly
### Tasks
- [X] An officially supported command
- [ ]…
-
### Model description
Please add support for HuggingFaceM4/Idefics3-8B-Llama3 in TGI:
_Idefics3 is an open multimodal model that accepts arbitrary sequences of image and text inputs and produces t…
-
When deploying TGI locally and running the model through text-generation-launcher, it keeps failing with: Server error: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cuda:1!
![image](https://github.com/WisdomShel…
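A common workaround for this mixed-device error, as a hedged sketch: pin the launcher to a single GPU (or set sharding explicitly) so every tensor lands on one device. `--model-id` and `--num-shard` are real text-generation-launcher flags; the model id below is a placeholder:

```shell
# Sketch: force TGI onto one GPU so cuda:0 and cuda:1 tensors cannot mix.
# "my-org/my-model" is a placeholder for whichever model you are serving.
CUDA_VISIBLE_DEVICES=0 text-generation-launcher \
    --model-id my-org/my-model \
    --num-shard 1
```

If multi-GPU sharding is actually wanted, the inverse (setting `--num-shard` to the GPU count) is the usual configuration to try instead.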
-
### Feature request
Llama 3.1 is out and should be compatible with Neuron, however, it requires `transformers==4.43.1`, and `optimum-neuron` has pinned `transformers` to `4.41.1`.
Note that sin…
-
OS type: Ubuntu
Description: When running the Translation example using Docker Compose, one of the images takes additional time on startup to pull a model from Hugging Face. During this period…
-
### Presentation of the new feature
Logits processors in outlines.processors support nearly every inference engine, offering a "write once, run anywhere" implementation of business logic.
Curren…
lapp0 updated
2 months ago
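To make the "write once, run anywhere" idea concrete, here is a generic illustration of the logits-processor pattern: a callable that rewrites raw logits before sampling. This is a plain-Python sketch of the interface, not the actual outlines.processors API:

```python
# Minimal illustration of a logits-processor: a callable taking the
# generated token ids and the raw logits, returning adjusted logits.
# Engines differ only in the tensor type they pass in; the business
# logic (here, an allow-list mask) is written once.

NEG_INF = float("-inf")

def make_allowlist_processor(allowed_ids):
    """Return a processor that forces sampling into `allowed_ids`."""
    allowed = set(allowed_ids)

    def processor(input_ids, logits):
        # Mask every token id that is not in the allow-list.
        return [l if i in allowed else NEG_INF for i, l in enumerate(logits)]

    return processor

proc = make_allowlist_processor({1, 3})
print(proc([0], [0.5, 1.2, -0.3, 2.0]))  # only ids 1 and 3 keep their logits
```

In a real engine the same mask would be applied to a torch/numpy/mlx tensor, which is exactly the portability the feature request is about.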
-
Request
The ask is to introduce an OpenAI text generation API compatibility layer (chat completion endpoint) to kserve/TGIS.
Why
Having an OpenAI API compatibility layer will allow more open sourc…
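For reference, this is roughly the request body such a compatibility layer would need to accept. The model name and endpoint URL below are placeholders, not anything kserve/TGIS currently exposes:

```python
# Sketch of an OpenAI-style chat completion request body.
import json

payload = {
    "model": "my-model",  # placeholder model id
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "max_tokens": 64,
}

# With a compatible server running, the payload would be POSTed to, e.g.:
#   requests.post("http://localhost:8080/v1/chat/completions", json=payload)
print(json.dumps(payload, indent=2))
```

Supporting this shape is what lets existing OpenAI client libraries point at the server unchanged.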
-
For 4096 tokens (which Omost forces), using a Llama-3 model on a 4090, completing the prompt takes 120 s, while SD takes only 7 s. That's a big gap.
How can we accelerate the local GPT?