-
While using other models like **meta-llama/Meta-Llama-3.1-8B-Instruct**,
I'm encountering a **torch.OutOfMemoryError** when trying to load the model across multiple GPUs. I have 4 GPUs, each with 14.57 GiB …
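For context on why an 8B model hits this limit, here is a rough back-of-the-envelope memory estimate, a minimal sketch assuming fp16/bf16 weights (2 bytes per parameter) and ignoring KV cache, activations, and CUDA context overhead, which all add to the per-GPU footprint:

```python
# Rough memory estimate for an 8B-parameter model on 4 GPUs of 14.57 GiB each.
# Assumption: fp16/bf16 weights at 2 bytes per parameter; KV cache and
# activations are NOT included, so real usage is higher.
PARAMS = 8_000_000_000
BYTES_PER_PARAM = 2          # fp16 / bf16
NUM_GPUS = 4
GPU_MEM_GIB = 14.57

weights_gib = PARAMS * BYTES_PER_PARAM / 2**30   # total weight memory
per_gpu_gib = weights_gib / NUM_GPUS             # ideal even shard

print(f"total weights: {weights_gib:.2f} GiB")   # ~14.90 GiB
print(f"per GPU:       {per_gpu_gib:.2f} GiB")   # ~3.73 GiB
print(f"fits on one GPU: {weights_gib < GPU_MEM_GIB}")
```

The weights alone (~14.90 GiB) slightly exceed a single 14.57 GiB GPU, so the model must actually be sharded; if loading still places everything on one device (e.g. without `device_map="auto"` in `transformers`), an OOM like the one above is expected.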
-
# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest code. Development is very rapid so there are no tagged versions as o…
-
Hi,
Environment: kubeai:0.4.1
It's difficult to pick the right title for this one, but basically I added a model using the new documentation provided last night - thank you!
```
apiVersion: …
-
### 🐛 Describe the bug
```
torchrun --nproc-per-node 8 dist_run.py
```
```
known configs: ['13B', '30B', '34B', '70B', '7B', 'CodeLlama-7b-Python-hf', 'Mistral-7B', 'stories110M', 'stories15M',…
-
### What happened?
Hi, when stress-testing llama-server (`--parallel 3`, prompt="Count 1 to 10000 in words") while running deepseek-coder-v2:16b-lite-instruct-q8_0, I got this assertion error in the logs…
-
(Note this is a WIP and will be added to)
This is a meta issue to keep track of the body of work required to:
* migrate keybinding dispatch to app-model
* ensure npe2 plugin keybinding contribu…
-
Congratulations on your paper!
I was wondering if this kind of approach can be applied more broadly.
Has the output model size been reduced in your implementation? If so, by how much?
Do you thin…
-
I only changed t6 instead of t4; t4 and t5 both work well for this model, but if we set threads=6, the problem is always triggered on my XIAOMI 14 Pro (SM8650, 8 Gen 3).
Please check and resolve.
Thanks~
…
-
## Package version
e.g. v4.4.0
## Describe the bug
When using a wildcard index in the model together with a `whereExact` query, the mapping lookup fails.
![image](https://github.com/user-attachments/assets/1…
-
# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [ ] I am running the latest code. Development is very rapid so there are no tagged versions as of…