-
### What is the issue?
Ollama fails to run on the GPU and falls back to the CPU. If I force it using `HSA_OVERRIDE_GFX_VERSION=9.0.0`, I get `Error: llama runner process has terminated: signal: abo…
-
### System Info
```shell
torch install path ............... ['/home/chatgpt/.local/lib/python3.10/site-packages/torch']
torch version .................... 2.1.2+cu121
deepspeed install path ..…
```
-
# URL
- https://arxiv.org/abs/2306.15448
# Affiliations
- Kanishk Gandhi, N/A
- Jan-Philipp Fränken, N/A
- Tobias Gerstenberg, N/A
- Noah D. Goodman, N/A
# Abstract
- As Large Language Model…
-
As the title describes, I slightly modified the QWenAttention.
Before:
![image](https://github.com/NVIDIA/TensorRT-LLM/assets/16505966/9ee6d300-5f92-489d-8022-cdf467f9acb2)
After:
![image](https:/…
-
### What is the issue?
This is possibly related to the fix for #4028. I updated to the 0.1.33 release and pulled the latest `mixtral:8x22b-instruct-v0.1-q4_0` (`6a0910fa6dc1`), so I'm running an 80…
-
Hello, guys. Thank you for all your great work on this awesome project!
I am currently building a new deep learning acceleration framework with it.
But I have some problems with it now. Hope you cou…
-
### 🐛 Describe the bug
ValueError: Target modules {'v_proj', 'up_proj', 'o_proj', 'down_proj', 'k_proj', 'q_proj', 'gate_proj'} not found in the base model. Please check the target modules and try …
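
One way to narrow down an error like this is to compare the requested target module names against the names the base model actually exposes. The sketch below is a minimal, hypothetical helper (`missing_target_modules` is not part of any library). It assumes a target counts as found when it equals a module name or is that name's final dotted component, and the module names in the example are made up for illustration.

```python
from typing import Iterable, Set


def missing_target_modules(module_names: Iterable[str], targets: Iterable[str]) -> Set[str]:
    """Return the targets that match no module name exactly or as a dotted suffix."""
    names = list(module_names)
    missing = set()
    for target in targets:
        # Assumed matching rule: exact name, or the last component after a dot.
        found = any(name == target or name.endswith("." + target) for name in names)
        if not found:
            missing.add(target)
    return missing


# Hypothetical module names, in the dotted form that `model.named_modules()` yields.
example_names = [
    "transformer.h.0.attn.c_attn",
    "transformer.h.0.attn.c_proj",
    "transformer.h.0.mlp.c_fc",
]
targets = {"q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"}
print(sorted(missing_target_modules(example_names, targets)))
```

With these made-up names, all seven targets are reported missing, which suggests the base model uses a different naming scheme than the one the target list expects.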
-
Hi all, I'm trying to run inference with the galactica-6.7B model, but errors start popping up after a few examples and I'm not sure what to do. Can anyone take a look and tell me what's wrong?
followin…
-
Hi!
Let's bring the documentation to all the Korean-speaking community 🌏 (currently 9 out of 77 complete)
Would you want to translate? Please follow the 🤗 [TRANSLATING guide](https://github.com…
-
Hello there.
I was wondering if weighting was a feature that is coming out, or if it is even on your radar.
I currently use a couple different free stat tools to run frequencies and cross tabs for …