-
The model used for ChatQnA supports BFLOAT16, in addition to TGI's default 32-bit float type: https://huggingface.co/Intel/neural-chat-7b-v3-3
With BFLOAT16, TGI memory usage halves from 30GB to 15GB (and also it…
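Switching TGI to BFLOAT16 is a launcher flag; a minimal launch sketch, assuming the standard `--dtype` option of the TGI launcher and the model linked above (port and volume paths are illustrative):

```shell
# Run TGI with BFLOAT16 weights instead of the default float32.
docker run --rm --gpus all -p 8080:80 \
  -v $PWD/data:/data \
  ghcr.io/huggingface/text-generation-inference:2.0 \
  --model-id Intel/neural-chat-7b-v3-3 \
  --dtype bfloat16
```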
-
Please update the TGI image from 1.4 to 2.0 in all TGI readme files.
I faced issues with the Phi-3 model.
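One way to do the bump across all readmes at once, sketched with `grep` and `sed` (the exact tag strings are assumptions; review each changed file before committing):

```shell
# Find every file still referencing the 1.4 image and rewrite the tag to 2.0.
grep -rl 'text-generation-inference:1.4' . \
  | xargs -r sed -i 's|text-generation-inference:1.4|text-generation-inference:2.0|g'
```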
-
I am trying to run the ChatQnA application.
- I am able to run all the microservices using the docker compose file.
- I am getting this error in the tgi-service. What is the correct way to provide the external IP…
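For reference, the ChatQnA compose files typically take the host address from environment variables rather than `localhost`; a sketch, where the `host_ip` variable name and the 8008 service port are assumptions taken from common OPEA setups (check your compose file):

```shell
# Export the machine's external IP once, then point the endpoint at it.
export host_ip=$(hostname -I | awk '{print $1}')   # or set it manually
export TGI_LLM_ENDPOINT="http://${host_ip}:8008"   # port 8008 is an assumption
docker compose up -d
```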
-
### System Info
Text-generation-inference: v2.1.0+
Driver Version: 535.161.08, CUDA Version: 12.2
GPU: DGX with 8xH100 80GB
### Information
- [x] Docker
- [ ] The CLI directly
### Tasks
- [x…
-
When manually launching my fine-tune of idefics2, Hugging Face TGI says `Unsupported model type idefics2`. How did you get the idefics2 TGI to run on RunPod?
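idefics2 support only landed in recent TGI releases, so an older image will report `Unsupported model type idefics2`. A minimal launch sketch, assuming a 2.0-series image and the base `HuggingFaceM4/idefics2-8b` checkpoint (substitute your fine-tune's model ID):

```shell
# idefics2 needs a TGI image new enough to include the architecture.
docker run --rm --gpus all -p 8080:80 \
  ghcr.io/huggingface/text-generation-inference:2.0.4 \
  --model-id HuggingFaceM4/idefics2-8b
```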
-
### Feature request
Apologies if this belongs elsewhere, but I'm curious whether you plan to add support for ONNX models like https://huggingface.co/microsoft/Phi-3-mini-128k-instruct-onnx
### M…
-
1. Are there plans for inference support? This is needed if it's to be used by devs in production.
2. Is fine-tuning much faster than LoRA?
- Optimization and the backward pass are MUCH faster, but sure…
-
Related to #258, why are the services using the `hostIPC` option [1]:
```
$ git grep hostIPC
ChatQnA/kubernetes/manifests/chaqna-xeon-backend-server.yaml: hostIPC: true
ChatQnA/kubernetes/manifests/e…
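```

For context, `hostIPC: true` shares the host's IPC namespace (including its shared-memory segment) with the pod, which TGI's sharded/NCCL path relies on. A narrower alternative, sketched here as an assumption rather than the project's current manifests, is a memory-backed `emptyDir` mounted at `/dev/shm`:

```yaml
# Instead of hostIPC: true, give the container its own large /dev/shm.
spec:
  containers:
    - name: tgi
      volumeMounts:
        - name: shm
          mountPath: /dev/shm
  volumes:
    - name: shm
      emptyDir:
        medium: Memory
        sizeLimit: 1Gi
```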
-
Add integration for [TGI](https://github.com/huggingface/text-generation-inference) LLM provider.
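For whoever picks this up, TGI exposes a simple REST interface; a minimal call sketch against a locally running server (the host and port are assumptions):

```shell
# POST a prompt to TGI's /generate endpoint; the response is JSON
# containing a "generated_text" field.
curl -s http://localhost:8080/generate \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "What is Deep Learning?", "parameters": {"max_new_tokens": 32}}'
```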
-
Hi,
When I tried the 13B version in TGI, it works fine with bitsandbytes quantization.
While trying AWQ quantization in TGI, it shows the error "Cannot load 'awq' weight, make sure the model is al…
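That error usually means TGI was pointed at full-precision weights: `--quantize awq` does not quantize on the fly, it expects an already-AWQ-quantized checkpoint. A launch sketch, assuming a pre-quantized community model such as `TheBloke/Llama-2-13B-AWQ` (an assumption; substitute the AWQ export of your model):

```shell
# --quantize awq requires the checkpoint to already contain AWQ weight
# tensors (qweight/qzeros/scales), unlike bitsandbytes.
docker run --rm --gpus all -p 8080:80 \
  ghcr.io/huggingface/text-generation-inference:2.0 \
  --model-id TheBloke/Llama-2-13B-AWQ \
  --quantize awq
```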