
---
### Priority
Undecided
### OS type
Ubuntu
### Hardware type
Gaudi2
### Installation method
- [X] Pull docker images from hub.docker.com
- [ ] Build docker images from source
### Deploy method
…

---
### System Info
tgi-gaudi 2.0.4
Used the docker compose YAML below to launch tgi-gaudi and serve the **llama3.1-70B-instruct** model with:
--top_k 10
--max_new_tokens 8192
--temperature 0.01
--top_p 0.95
…
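For reference, these sampling flags correspond to fields of TGI's `/generate` request parameters; a minimal client-side sketch (the prompt text and endpoint are placeholder assumptions):

```python
# Build a request body for TGI's /generate endpoint using the flags
# reported above; the prompt and endpoint are placeholder assumptions.
import json

payload = {
    "inputs": "What is deep learning?",  # placeholder prompt
    "parameters": {
        "top_k": 10,
        "max_new_tokens": 8192,
        "temperature": 0.01,
        "top_p": 0.95,
    },
}
body = json.dumps(payload)
# POST `body` to http://<host>:8080/generate with
# Content-Type: application/json (e.g. via curl or requests).
```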

---
### System Info
```shell
Bad:
Optimum Habana latest main: c495f479d9abf04fb7adb6f0a5607d7963186649
Synapse docker image: v1.16
Good:
Optimum Habana one commit before the Transformers 4.40 upgrade: 56…
```
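To compare the two states, the "bad" revision can be pinned directly with pip's git syntax (commit hash copied from above; the "good" hash is truncated in this report, so it is not reproduced here):

```shell
# Install optimum-habana at the reported "bad" commit for comparison
pip install "git+https://github.com/huggingface/optimum-habana.git@c495f479d9abf04fb7adb6f0a5607d7963186649"
```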

---
I have been staging some updates, testing the tgi-gaudi software with Llama 405B FP8. I am waiting for Habana Optimum to approve the PR, and then I will submit a PR for huggingface/tgi_gaudi and will s…

---
### System Info
```shell
Optimum Habana version v1.12.1
Synapse 1.16.2
docker vault.habana.ai/gaudi-docker/1.16.2/ubuntu22.04/habanalabs/pytorch-installer-2.2.2:latest
```
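For reference, a container from that image is typically launched with Habana's documented runtime flags; a sketch (adjust mounts, network, and device visibility to your setup):

```shell
# Launch the Gaudi PyTorch container with the standard Habana runtime
# flags; adjust --net/--ipc and any volume mounts to your environment.
docker run -it --runtime=habana \
  -e HABANA_VISIBLE_DEVICES=all \
  -e OMPI_MCA_btl_vader_single_copy_mechanism=none \
  --cap-add=sys_nice --net=host --ipc=host \
  vault.habana.ai/gaudi-docker/1.16.2/ubuntu22.04/habanalabs/pytorch-installer-2.2.2:latest
```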
### Information
- [ ] …

---
### System Info
```shell
optimum 1.21.4
optimum-habana 1.14.0.dev0
transformers 4.45.2
+------------------------------------------------------------------…
```

---
### System Info
```shell
System Configuration: Single node Habana Gaudi setup
Firmware Version: hl-1.15.0-fw-48.2.1.1
Software Stack: Synapse AI 1.15
```
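If it helps reproduction, firmware and driver versions like those quoted above can be read on the box with Habana's `hl-smi` tool:

```shell
# Print Gaudi device status, including driver and firmware versions
hl-smi
```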
### Information
- [ ] The official examp…

---
### Your current environment
The output of `python collect_env.py`
```
NUMA node0 CPU(s): 0-39,80-119
NUMA node1 CPU(s): 40-79,120-159
Vulnerability Gath…
```

---
### Your current environment
model: Qwen2-7B
input_length: 16384, 30720
batch_size: 1,5,10,20,50
### 🐛 Describe the bug
How can Habana vLLM be used to support long sequences?
my serving scrip…
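For long prompts, the engine's context window must cover the input length plus the generation budget; a hedged sketch of server arguments sized for the configurations above (flag names are standard vLLM; the 32768 context size is an assumption, not from the report):

```python
# Hypothetical argument list for vLLM's OpenAI-compatible server,
# sized for the longest reported prompt (30720 tokens). Flag names
# are standard vLLM; the chosen max_model_len is an assumption.
max_model_len = 32768  # must cover input_length plus generated tokens

args = [
    "--model", "Qwen/Qwen2-7B",
    "--max-model-len", str(max_model_len),
    "--max-num-seqs", "50",  # largest batch size in the sweep above
]
print("python -m vllm.entrypoints.openai.api_server " + " ".join(args))
```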

---
### Your current environment
```text
The output of `python collect_env.py`
```
### 🐛 Describe the bug
```python
seq_group_metadata_list.extend(
    self.create_dummy_seq_group_metadata(0, 0, is_prompt)
    for…
```