-
Obsoletes #147, #150, https://github.com/ggerganov/llama.cpp/issues/1575, https://github.com/ggerganov/llama.cpp/issues/1590, https://github.com/rustformers/llm/discussions/143, and probably some othe…
-
### System Info
```Shell
- `Accelerate` version: 0.20.3
- Platform: Linux-5.15.0-1023-aws-x86_64-with-glibc2.2.5
- Python version: 3.8.11
- Numpy version: 1.24.3
- PyTorch version (GPU?): 2.0.1+c…
-
After fine-tuning LLaMA with LoRA, how do I load the model across multiple GPUs?
Are there any examples?
-
Trying a simple example on an M1 Mac:
```
from ctransformers import AutoModelForCausalLM

llm = AutoModelForCausalLM.from_pretrained(
    "/path/to/starcoderbase-GGML/starcoderbase-ggml-q4_0.bin",
…
-
### 🐛 Describe the bug
I want to deploy the GPT-2 model. I set up the environment on my server (CentOS 7) and then ran the Text Generation example from Hugging Face Transformers, but it fails.
…
-
### Duplicates
- [X] I have searched the existing issues
### Steps to reproduce 🕹
_No response_
### Current behavior 😯
When using Chinese text, the length increases after encoding, which may caus…
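One likely explanation (my assumption, since the report is cut off): byte-level tokenizers operate on UTF-8 bytes, and each CJK character occupies three bytes, so the encoded sequence can be several times longer than the character count. A minimal stdlib-only illustration:

```python
# Byte-level tokenizers see UTF-8 bytes, not characters.
# Each CJK character below encodes to 3 bytes, so length triples.
text = "你好，世界"              # 5 characters
encoded = text.encode("utf-8")   # the byte sequence a byte-level BPE starts from
print(len(text), len(encoded))   # → 5 15
```

The actual token count depends on the tokenizer's merges, but the byte expansion above sets a floor on how much longer Chinese input becomes relative to its character count.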
-
First, thanks for sharing your great research.
I have reviewed the paper and the code, and it appears to be a form of adding a KERPLE bias to the attention score.
However, since the code is in neo…
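For context, the logarithmic variant of the KERPLE bias is a distance-dependent penalty added to the pre-softmax attention scores. A minimal NumPy sketch (the parameter names `r1`, `r2` and the toy shapes are mine, not taken from the repository):

```python
import numpy as np

def kerple_log_bias(seq_len: int, r1: float = 1.0, r2: float = 1.0) -> np.ndarray:
    """Logarithmic-variant KERPLE bias: -r1 * log(1 + r2 * |m - n|)."""
    pos = np.arange(seq_len)
    dist = np.abs(pos[:, None] - pos[None, :])  # pairwise distances |m - n|
    return -r1 * np.log1p(r2 * dist)            # 0 on the diagonal, more negative with distance

def attention_scores(q: np.ndarray, k: np.ndarray, bias: np.ndarray) -> np.ndarray:
    """Scaled dot-product scores with the positional bias added before softmax."""
    d = q.shape[-1]
    return q @ k.T / np.sqrt(d) + bias

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(4, 8))
scores = attention_scores(q, k, kerple_log_bias(4))
```

In the paper, `r1` and `r2` are learned per head; here they are fixed constants just to show where the bias enters the score computation.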
-
:/
-
System: M1 Mac
With vanilla CTranslate2 (installed via pip), I was unable to use more than one thread, and I got this warning when trying to increase the thread count: "The number of threads (intra_th…
-
### System Info
```shell
Optimum 1.5.1
Transformers 4.25.1 (training was fine with 4.24.0)
```
### Who can help?
@JingyaHuang
### Information
- [X] The official example scripts…