-
### Review Mojo's priorities
- [X] I have read the [roadmap and priorities](https://docs.modular.com/mojo/roadmap.html#overall-priorities) and I believe this request falls within the priorities.
###…
-
### Feature request
This framework uses an advanced prefix-cache technique to accelerate inference. It brings more than a 30% improvement compared to TGI without speculation. Is it possible to integrate it in…
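A minimal sketch of the prefix-caching idea mentioned above, under the assumption that it means reusing precomputed state (e.g. attention KV entries) for a shared prompt prefix; the class and function names here are hypothetical, and the "state" is a stand-in so the example stays self-contained:

```python
# Hypothetical sketch of prefix caching: the state computed for a shared
# prompt prefix is stored once and reused by later requests that begin
# with the same tokens, so only the new suffix needs a fresh forward pass.

class PrefixCache:
    def __init__(self):
        self._cache = {}  # maps a token-tuple prefix -> precomputed state

    def lookup(self, tokens):
        """Return (length of the longest cached prefix, its state or None)."""
        for end in range(len(tokens), 0, -1):
            state = self._cache.get(tuple(tokens[:end]))
            if state is not None:
                return end, state
        return 0, None

    def store(self, tokens, state):
        self._cache[tuple(tokens)] = state


def run_model(tokens, cached_state=None, start=0):
    # Stand-in for the real forward pass; here the "state" is just the
    # token list itself so the sketch is runnable without a model.
    return list(tokens)


cache = PrefixCache()
prompt = [1, 2, 3, 4, 5]
cache.store(prompt[:3], run_model(prompt[:3]))

hit_len, state = cache.lookup(prompt)
# Only tokens after the cached prefix need to be recomputed.
new_state = run_model(prompt, cached_state=state, start=hit_len)
```

In a real server the cached state would be KV-cache blocks and the lookup would typically be a trie or block table rather than a linear scan.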
-
### Feature request
An increasingly common question is how to support inference for multiple LoRA models running against a single backbone model. What's preventing TGI from implementing a feature lik…
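The multi-LoRA idea in the request above can be sketched as follows: one frozen backbone weight is shared, and each adapter only contributes a pair of small low-rank matrices, so the per-adapter output is `W @ x + B_i @ (A_i @ x)`. All dimensions and adapter names here are illustrative assumptions:

```python
import numpy as np

# Hypothetical sketch of multi-LoRA serving: one shared backbone weight W
# plus small per-adapter matrices (A, B). Many adapters can be served
# against the same backbone because each adds only a cheap low-rank term.

rng = np.random.default_rng(0)
d, r = 8, 2                      # hidden size and LoRA rank (illustrative)
W = rng.standard_normal((d, d))  # frozen backbone weight

adapters = {
    "adapter_a": (rng.standard_normal((r, d)), rng.standard_normal((d, r))),
    "adapter_b": (rng.standard_normal((r, d)), rng.standard_normal((d, r))),
}

def forward(x, adapter_id=None):
    y = W @ x                    # shared backbone computation
    if adapter_id is not None:
        A, B = adapters[adapter_id]
        y = y + B @ (A @ x)      # per-request low-rank update
    return y

x = rng.standard_normal(d)
base = forward(x)
tuned = forward(x, "adapter_a")
```

In a batched server the interesting part is grouping requests by adapter (or using gathered/batched low-rank matmuls) so the shared backbone pass runs once per batch; this sketch only shows the per-request math.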
-
Please, how can I add support for the Gemma model?
-
### System Info
TGI 1.3
Ubuntu 18
Python 3.10
### Information
- [X] Docker
- [ ] The CLI directly
### Tasks
- [X] An officially supported command
- [ ] My own modifications
### Reproduction
C…
-
### Feature request
I can't find any guidance on integrating HuggingFace TGI and AWS Inferentia.
I've found several documents about deployment guides for individual end-to-end models, but I don't se…
-
I tried to quantize the model **Llama-2-13b-hf** using bitsandbytes, but I found that int4 inference performance is lower than fp16 inference, on both the A100 and the 3090.
Can you tell me why and how …
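One common reason for the slowdown described above is that weight-only int4 quantization still computes in floating point: the packed weights must be dequantized before every matmul, and at small batch sizes that extra elementwise pass can outweigh the memory-bandwidth savings. A toy NumPy illustration of this (not the actual bitsandbytes kernels):

```python
import numpy as np

# Illustrative sketch (not bitsandbytes internals) of weight-only int4
# inference: weights are stored as small integers plus a scale, and must
# be dequantized back to floating point before the matmul. The matmul
# itself still runs in fp, plus the extra dequantization work.

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4)).astype(np.float32)

# Symmetric 4-bit quantization: integers in [-8, 7] plus one fp scale.
scale = np.abs(W).max() / 7.0
W_q = np.clip(np.round(W / scale), -8, 7).astype(np.int8)

x = rng.standard_normal(4).astype(np.float32)

# fp16/fp32 path: a single matmul on the stored weights.
y_fp = W @ x

# int4 path: dequantize first, then matmul -- an extra pass over W.
y_q = (W_q.astype(np.float32) * scale) @ x

max_err = np.abs(y_fp - y_q).max()  # small quantization error remains
```

The trade-off pays off when loading the model is memory-bound (large weights, small activations); when compute or kernel overhead dominates, fp16 can win, which matches the observation in the issue.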
-
### System Info
Thank you for adding support for Medusa. In my comparison of Medusa models against the original base models with TGI, the base models appeared to be quicker.
I tested the below models:…
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a…
-
Current process
1. Issues in backlog: every new issue, plus issues that are not under development but are being tracked for various reasons (for example: issues that are (1) interesting but no resources ava…