-
I have noticed that my fine-tuned versions of the phi-3.5-mini model generate incoherent content when exceeding an output length of 4096 tokens. I could reproduce this behaviour with the base-model as…
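A quick sanity check for a hard quality cliff at exactly 4096 tokens is to compare the request length against the window sizes declared in the model's Hugging Face-style `config.json`. The sketch below is illustrative, not a confirmed diagnosis: the field names follow `config.json` conventions, but the example values are placeholders, not the actual phi-3.5-mini config.

```python
# Sketch: flag generation requests whose prompt + output length crosses
# one of the context-window limits declared in a HF-style config.json.
# The config dict below uses ILLUSTRATIVE values, not the real
# phi-3.5-mini configuration.

def exceeds_declared_windows(config: dict, prompt_tokens: int, max_new_tokens: int) -> list[str]:
    """Return the names of the config limits this request would cross."""
    total = prompt_tokens + max_new_tokens
    crossed = []
    for key in ("max_position_embeddings", "sliding_window", "original_max_position_embeddings"):
        limit = config.get(key)
        if limit is not None and total > limit:
            crossed.append(key)
    return crossed

example_config = {  # placeholder values only
    "max_position_embeddings": 131072,
    "original_max_position_embeddings": 4096,
    "sliding_window": None,
}

print(exceeds_declared_windows(example_config, prompt_tokens=200, max_new_tokens=8000))
# → ['original_max_position_embeddings']
```

Crossing `original_max_position_embeddings` while staying under `max_position_embeddings` hints that long-context RoPE scaling, rather than the base window, governs quality past 4096 tokens, which would explain the base model reproducing the behaviour.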
-
### System Info
TGI Docker Image: ghcr.io/huggingface/text-generation-inference:sha-11d7af7-rocm
MODEL: meta-llama/Llama-3.1-405B-Instruct-FP8
Hardware used:
Intel® Xeon® Platinum 8…
-
### System Info
`pip install text-generation` with version '0.6.0'
I need to use python package not docker
### Information
- [ ] Docker
- [ ] The CLI directly
### Tasks
- [ ] An officially suppo…
-
### System Info
- text-generation-inference:2.3.0, deployed on docker
- model info:
{
"model_id": "meta-llama/Llama-3.1-8B-Instruct",
"model_sha": "0e9e39f249a16976918f6564b8830bc894c89659…
-
### System Info
TGI from Docker
text-generation-inference:2.2.0
host: Ubuntu 22.04
NVIDIA T4 (x1)
nvidia-driver-545
### Information
- [X] Docker
- [ ] The CLI directly
### Tasks
- [X] An o…
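For context, a typical single-GPU launch of the image above looks like the following. This is a sketch following TGI's documented Docker usage, with the model id and port as placeholders; on a T4, which lacks bfloat16 support, `--dtype float16` is usually required.

```shell
# Sketch of a single-T4 TGI launch (model id and paths are placeholders)
docker run --gpus all --shm-size 1g -p 8080:80 \
  -v "$PWD/data:/data" \
  ghcr.io/huggingface/text-generation-inference:2.2.0 \
  --model-id meta-llama/Llama-3.1-8B-Instruct \
  --dtype float16
```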
-
**Is your feature request related to a problem? Please describe.**
Text Generation Inference serves lots of models very quickly
**Describe the solution you'd like**
Specify Text Generatio…
-
### Model/Pipeline/Scheduler description
ConsistencyTTA, introduced in the paper [_Accelerating Diffusion-Based Text-to-Audio Generation
with Consistency Distillation_](https://arxiv.org/abs/2309.…
-
```
cd /root/workspace/github/optimum-habana/examples/text-generation/
python run_generation.py \
--model_name_or_path /root/workspace/model/meta-llama/Llama-3.1-8B/ \
--use_hpu_graphs \
--use_kv…
-
Unable to load the model; the backend reports an error:
1. Current version is Python 3.11.9; previously tried Python 3.10.x, which errored as well
2. xinference 0.12.2
3. Installed via `pip install "xinference[all]"` and run in the local environment
4. Full stack trace of the error:
Traceback (most recent call last):
File …
-
We were able to fine-tune it on our customized dataset. However, when we tried to run inference by loading the fine-tuned checkpoint, we got:
```
File "/data/shaokai/miniconda3/envs/llava/lib/python…