-
### What happened?
Since commit b3188, llama-cli produces incoherent output on multi-GPU systems with CUDA and row tensor splitting.
Layer tensor splitting works fine but is almost twice as slow.
…
-
### Bug Report
- I am playing around with a Nobara Linux installation (based on Fedora 39). I just tried one of my older GGUF files (mistral-7b-code-16k-qlora.Q4_K_M.gguf) and it crashes GPT4All …
-
Subscribe to this issue and stay notified about new [daily trending repos in Go](https://github.com/trending/go?since=daily)!
-
Over the holiday, I started writing down some requirements for ChatOps to at least get the conversation started. My hope is that while we're working on the current sprint, at least having seen this th…
-
Thanks for the great product.
I am so impressed with your research that I have tried it many times.
However, the results with Gemma-2-9B are very different from your results.
The score was even I…
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
Below is my code.
```
import torch
# from transformers import BitsAndBytesConfi…
-
According to the instructions, we can add a make_db.py database to auth.json, but they do not specify exactly how to do this.
```
To make a new one for the user, fill `user_path_jon` with documents (…
rxng updated
4 months ago
-
# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest code. Development is very rapid so there are no tagged versions as of…
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
Hi,
no matter whether I use this small snippet or my sophisticated application.…
ghost updated
5 months ago