-
I have noticed that my fine-tuned versions of the phi-3.5-mini model generate incoherent content when exceeding an output length of 4096 tokens. I could reproduce this behaviour with the base-model as…
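A quick sanity check for a hard quality cliff at exactly 4096 tokens is to compare the request length against the window sizes declared in the model's Hugging Face-style `config.json`. The sketch below is illustrative, not a confirmed diagnosis: the field names follow `config.json` conventions, but the example values are placeholders, not the actual phi-3.5-mini config.

```python
# Sketch: flag generation requests whose prompt + output length crosses
# one of the context-window limits declared in a HF-style config.json.
# The config dict below uses ILLUSTRATIVE values, not the real
# phi-3.5-mini configuration.

def exceeds_declared_windows(config: dict, prompt_tokens: int, max_new_tokens: int) -> list[str]:
    """Return the names of the config limits this request would cross."""
    total = prompt_tokens + max_new_tokens
    crossed = []
    for key in ("max_position_embeddings", "sliding_window", "original_max_position_embeddings"):
        limit = config.get(key)
        if limit is not None and total > limit:
            crossed.append(key)
    return crossed

example_config = {  # placeholder values only
    "max_position_embeddings": 131072,
    "original_max_position_embeddings": 4096,
    "sliding_window": None,
}

print(exceeds_declared_windows(example_config, prompt_tokens=200, max_new_tokens=8000))
# → ['original_max_position_embeddings']
```

Crossing `original_max_position_embeddings` while staying under `max_position_embeddings` hints that long-context RoPE scaling, rather than the base window, governs quality past 4096 tokens, which would explain the base model reproducing the behaviour.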
-
### System Info
TGI Docker Image: ghcr.io/huggingface/text-generation-inference:sha-11d7af7-rocm
MODEL: meta-llama/Llama-3.1-405B-Instruct-FP8
Hardware used:
Intel® Xeon® Platinum 8…
-
### System Info
`pip install text-generation` with version '0.6.0'
I need to use python package not docker
### Information
- [ ] Docker
- [ ] The CLI directly
### Tasks
- [ ] An officially suppo…
-
### System Info
- text-generation-inference:2.3.0, deployed on docker
- model info:
{
"model_id": "meta-llama/Llama-3.1-8B-Instruct",
"model_sha": "0e9e39f249a16976918f6564b8830bc894c89659…
-
### System Info
TGI from Docker
text-generation-inference:2.2.0
host: Ubuntu 22.04
NVIDIA T4 (x1)
nvidia-driver-545
### Information
- [X] Docker
- [ ] The CLI directly
### Tasks
- [X] An o…
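For context, a typical single-GPU launch of the image above looks like the following. This is a sketch following TGI's documented Docker usage, with the model id and port as placeholders; on a T4, which lacks bfloat16 support, `--dtype float16` is usually required.

```shell
# Sketch of a single-T4 TGI launch (model id and paths are placeholders)
docker run --gpus all --shm-size 1g -p 8080:80 \
  -v "$PWD/data:/data" \
  ghcr.io/huggingface/text-generation-inference:2.2.0 \
  --model-id meta-llama/Llama-3.1-8B-Instruct \
  --dtype float16
```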
-
**Is your feature request related to a problem? Please describe.**
Text Generation Inference serves lots of models very quickly
**Describe the solution you'd like**
Specify Text Generatio…
-
### Model/Pipeline/Scheduler description
ConsistencyTTA, introduced in the paper [_Accelerating Diffusion-Based Text-to-Audio Generation
with Consistency Distillation_](https://arxiv.org/abs/2309.…
-
```
cd /root/workspace/github/optimum-habana/examples/text-generation/
python run_generation.py \
--model_name_or_path /root/workspace/model/meta-llama/Llama-3.1-8B/ \
--use_hpu_graphs \
--use_kv…
-
Unable to load the model; the backend reports an error:
1. Current version is Python 3.11.9; previously tried Python 3.10.x, which errored as well
2. xinference 0.12.2
3. Installed via `pip install "xinference[all]"` and run in the local environment
4. Full stack trace of the error:
Traceback (most recent call last):
File …
-
We were able to fine-tune it on our customized dataset. However, when we tried to run inference by loading the fine-tuned checkpoint, we got:
```
File "/data/shaokai/miniconda3/envs/llava/lib/python…