-
### Context and Issue
I'm attempting to quantize the model [alpindale/goliath-120b](https://huggingface.co/alpindale/goliath-120b) using [royallab/PIPPA-cleaned](https://huggingface.co/datasets/royal…
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a sim…
-
I'm using exllamav2 for the first time. I built from source today (minus the ~3 commits that break Windows builds). I saw some strange behavior and bisected it to the commit in the title.
Expected:…
-
Hi, I tried the new quant method (`master` branch) with goliath 120b using the built-in calibration dataset (not specifying -c parameter).
`-b 3.0 -hb 8 -rs 1.0`
`# Module quantized, calibration pe…
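For reference, the invocation described above can be sketched as follows. This is a hedged sketch: the `-i`/`-o` input and output flags and the placeholder paths are assumptions, while `-b 3.0 -hb 8 -rs 1.0` are taken from the report; `-c` is deliberately omitted so the built-in calibration dataset is used.

```shell
# Hedged sketch of the quantization run (paths are placeholders):
python convert.py \
    -i /path/to/goliath-120b \   # input FP16 model directory (assumed flag)
    -o /path/to/work-dir \       # working/output directory (assumed flag)
    -b 3.0 \                     # target bits per weight, as reported
    -hb 8 \                      # head bits, as reported
    -rs 1.0                      # as reported; no -c, so built-in calibration data is used
```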
-
Important note: it works only with Vulkan, not with ROCm, even though I have installed ROCm 5.5, which does support the RX 7900 XTX: https://rocm.docs.amd.com/en/docs-5.5.1/release/windows_support.…
-
**Purpose**: This issue compiles meeting notes for the Gno Core Staff's recurring meetings.
**Process**:
1. **Drafting**: Notes are initially taken in Hackmd or Google Docs during meetings.
2. *…
-
I am hitting the failure below very frequently, roughly 1 out of 20 times:
```
ws = connect(
/usr/local/python/python-3.10/std/lib64/python3.10/site-packages/websockets/sync/client.py:289: in connect
conn…
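A common mitigation for intermittent connection failures like this is a retry with exponential backoff around the connect call. A minimal sketch of the pattern, using a stub `flaky_connect` standing in for `websockets.sync.client.connect` (the helper name and parameters are hypothetical, not part of the websockets API):

```python
import time

def connect_with_retry(connect, attempts=3, delay=0.01):
    """Call a flaky connect() callable, retrying with exponential backoff."""
    last_exc = None
    for i in range(attempts):
        try:
            return connect()
        except OSError as exc:  # connection failures are OSError subclasses
            last_exc = exc
            time.sleep(delay * (2 ** i))  # back off: delay, 2*delay, 4*delay, ...
    raise last_exc

# Stub standing in for the real connect(): fails twice, then succeeds.
calls = {"n": 0}
def flaky_connect():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionRefusedError("simulated transient failure")
    return "ws-connection"

print(connect_with_retry(flaky_connect))  # -> ws-connection
```

The same wrapper can be applied to the real `connect(...)` call; tune `attempts` and `delay` to the observed failure rate.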
-
When attempting to split the model on multiple GPUs, I get the following error:
```
> python test_chatbot.py -d /home/john/Projects/Python/text-models/text-generation-webui/models/TheBloke_guanaco…
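For context, a hedged sketch of what a multi-GPU invocation typically looks like here, assuming exllama's `-gs` (GPU split) flag, which takes a comma-separated list of per-device VRAM allocations in GB; the flag semantics and the placeholder path are assumptions, not taken from the report:

```shell
# Hedged sketch (flag semantics assumed): ~16 GB on GPU 0, ~22 GB on GPU 1
python test_chatbot.py -d /path/to/model -gs 16,22
```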
-
Hi there, thanks for all the hard work.
My system has 2x4090.
First, the FP16 model works when loaded with bitsandbytes 4-bit, at decent speeds.
`Output generated in 81.62 seconds (4.47 tokens/s, 36…