-
**Describe the bug**
I am using version 0.9.4, which includes the two PRs below, but CPU inference is not working:
CUDA optional deepspeed ops https://github.com/microsoft/DeepSpeed/pull/2507
Enabl…
-
Dear EleutherAI team,
I've noticed that the weights associated with the recently added "step0" and "step1" checkpoints are identical across all Pythia models:
```
def main():
    print(f"========…
```
-
**Describe the bug**
RuntimeError: The expanded size of the tensor (1) must match the existing size (10) at non-singleton dimension 2. Target sizes: [1, 4, 1, 10]. Tensor sizes: [1, 1, 10, 10]
F…
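For context, this error comes from broadcasting via `Tensor.expand`: only size-1 (singleton) dimensions can be expanded, and here dimension 2 already has size 10 while the target asks for 1. A minimal, model-independent reproduction with the exact shapes from the message:

```python
import torch

# Attention-mask-like tensor with the shape from the error message.
mask = torch.ones(1, 1, 10, 10)

# Expanding to [1, 4, 1, 10] fails: dimension 2 has size 10 (non-singleton),
# so it cannot be reshaped to 1 by expand; only size-1 dims may be broadcast.
try:
    mask.expand(1, 4, 1, 10)
except RuntimeError as e:
    print(e)
```

This usually points at an attention mask being built with its sequence dimensions in the wrong positions relative to what the model expects.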
-
First of all, congratulations on this project.
I am on the Ubuntu 22.04 platform.
I wanted to try https://huggingface.co/cakewalk/ggml-q4_0-stablelm-tuned-alpha-7b/blob/main/ggml-model-stablelm-tune…
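For background (not from the linked repo): `q4_0` in the filename refers to ggml's simplest 4-bit block quantization, where weights are stored in blocks of 32 values sharing one float scale and dequantized as scale · (nibble − 8). A rough sketch assuming that layout; `dequantize_q4_0` is a hypothetical helper, not ggml's actual C implementation:

```python
def dequantize_q4_0(scale, nibbles):
    """Dequantize one ggml q4_0 block.

    A block is 32 4-bit values (0..15) sharing a single float scale;
    values are centred by subtracting 8, so each weight is scale * (q - 8).
    """
    assert len(nibbles) == 32, "q4_0 blocks hold exactly 32 quantized values"
    return [scale * (q - 8) for q in nibbles]
```

The upshot is roughly 4.5 bits per weight (32 nibbles plus one scale per block), which is why the 7B file is a few GB.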
-
```
LlamaTokenizer.from_pretrained('KRAFTON/KORani-v1-13B')

309 def LoadFromFile(self, arg):
--> 310     return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)

TypeError: not a s…
```
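For what it's worth, this `TypeError` from `SentencePieceProcessor_LoadFromFile` usually means the tokenizer file path resolved to something other than a `str` — for example `None` because `tokenizer.model` is missing from the repo, or a `pathlib.Path` object. A small defensive check one could run before loading; `check_sp_model_path` is a hypothetical helper, not part of sentencepiece:

```python
import os

def check_sp_model_path(model_path):
    """Validate the argument before handing it to SentencePieceProcessor.

    LoadFromFile raises 'TypeError: not a string' for non-str inputs
    (e.g. None or pathlib.Path), which hides the real problem: the
    tokenizer model file was never found or downloaded.
    """
    if not isinstance(model_path, str):
        raise TypeError(f"expected str path, got {type(model_path).__name__}")
    if not os.path.isfile(model_path):
        raise FileNotFoundError(f"tokenizer model not found: {model_path}")
    return model_path
```

In practice, checking whether the Hugging Face repo actually contains a `tokenizer.model` file is the first thing to verify.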
-
### Describe the bug
I am trying to fine-tune Llama-2 with raw text-file data.
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Reproduction
My llama file is t…
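As an aside, fine-tuning on a raw text file usually involves tokenizing the whole file and grouping the flat token stream into fixed-length blocks. A minimal sketch of just that grouping step; the hypothetical `chunk_tokens` below mirrors the usual `group_texts` pattern and is independent of any particular trainer:

```python
def chunk_tokens(token_ids, block_size):
    """Group a flat token stream into fixed-length training blocks.

    Drops the ragged remainder at the end, which is the conventional
    behaviour when preparing causal-LM training examples.
    """
    total = (len(token_ids) // block_size) * block_size
    return [token_ids[i:i + block_size] for i in range(0, total, block_size)]
```

Each resulting block then becomes one training example, with labels equal to the inputs for causal language modeling.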
-
Hi!
I am trying to fine-tune MPT-7B with LoRA configurations.

```
# Model
model_name = "mosaicml/mpt-7b"
config = transformers.AutoConfig.from_pretrained(
    model_name,
    trust_remote_code=…
```
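For context on what LoRA actually trains: the base weight W stays frozen and only a low-rank update is learned, computing y = xW + (α/r)·xAB. A minimal NumPy sketch of that forward pass; `lora_forward` is illustrative only, not the peft implementation:

```python
import numpy as np

def lora_forward(x, W, A, B, alpha):
    """Low-rank adapted linear layer: y = x W + (alpha / r) * x A B.

    x: (batch, d_in); W: (d_in, d_out), the frozen base weight;
    A: (d_in, r) and B: (r, d_out), the trainable low-rank update.
    B is initialized to zero, so training starts from the base model.
    """
    r = A.shape[1]
    return x @ W + (alpha / r) * (x @ A @ B)
```

With B initialized to zero the adapted layer is exactly the base layer at step 0, which is why LoRA fine-tuning is stable to start.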
-
Hi guys.
If I translate the datasets, will they work with Pygmalion? I want to translate the datasets into Portuguese.
-
Hi, I've tested mpt-7b-instruct and it does understand Chinese, but what confuses me is that the tokenizer mpt-7b uses does not support Chinese; it's English-only. So how should I understand this? Is …
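One likely explanation: MPT-7B reuses the GPT-NeoX byte-level BPE tokenizer, and byte-level tokenizers have no out-of-vocabulary symbols — any string falls back to its raw UTF-8 bytes, all of which are in the base vocabulary. Chinese text is therefore fully representable, just at more tokens per character than English. A sketch of the fallback idea; `byte_fallback_tokens` is illustrative, not the real tokenizer:

```python
def byte_fallback_tokens(text):
    """Worst-case byte-level tokenization of a string.

    A byte-level BPE vocabulary contains all 256 byte values, so even
    text with no learned merges (e.g. Chinese for an English-trained
    tokenizer) decomposes into UTF-8 bytes rather than <unk>.
    """
    return list(text.encode("utf-8"))
```

So "does not support Chinese" really means "was not trained to merge Chinese bytes efficiently", not "cannot encode Chinese".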
-
```
python3 Andromeda/build_dataset.py --seed 42 --seq_len 8192 --hf_account "" --tokenizer "EleutherAI/gpt-neox-20b" --dataset_name "EleutherAI/the_pile_deduplicated"
Traceback (most recent call las…
```