-
### Describe the bug
Intel® Gaudi AI SW Tools Operator version 0.0.1 fails to deploy on OCP 4.16.0
**Message displayed:**
Operator failed
install failed: deployment gaudi-ai-sw-tools-operator-co…
-
### Priority
P3-Medium
### OS type
Ubuntu
### Hardware type
Xeon-ICX
### Installation method
- [ ] Pull docker images from hub.docker.com
- [ ] Build docker images from source
…
-
Intel Gaudi Software 1.19.0 will be released in mid-December 2024 with Torch 2.5.0.
The Python version is TBD; it's going to be either Python 3.11 or Python 3.12.
Next instructlab downstream release has to sup…
-
### Your current environment
```text
The output of `python collect_env.py`
```
### How you are installing vllm
We are using `v0.5.3.post1+Gaudi-1.18.0` and image https://github.com/HabanaAI/vll…
-
InstructLab has train configuration profiles for NVIDIA A100, L40, and L4, but not for Intel Gaudi. Please add a profile for Intel Gaudi 3 systems.
An Intel Gaudi 3 server comes as an 8-way system.…
-
**Feature Overview (aka. Goal Summary)**
Implement Intel Gaudi support in the InstructLab project, so that Gaudi 2 and Gaudi 3 can be used for SDG, evaluation, and training.
**Goals (aka. expected user outcom…
-
### System Info
```shell
When I use the k8s sample example for LoRA with the Llama 3 8B model, it works fine, but with the 70B model it fails with OOM.
Total number of GPUs: 8 x Gaudi3 GPUs
Dataset: databr…
-
- Intel has a vLLM fork, so ensure it is up to date with their latest drivers (1.18?)
- Build and test it
-
### System Info
```shell
Google Colab (CPU runtime)
```
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
- [X] An officially supported task in the `exam…
-
**Is your feature request related to a problem? Please describe.**
InstructLab 0.19.3 and its dependencies limit PyTorch to