-
### System Info
```
Ubuntu 20.04
Python 3.10.14
torch 2.3.0
transformers 4.42.3
bitsandbytes 0.42.0
CUDA Version: 12.4
GPU: RTX 3090
torch.cuda.is_avai…
```
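For reference, a quick way to verify that the environment above actually sees the GPU (a minimal check using only standard `torch` calls; the report itself is truncated mid-check):

```python
import torch

# Confirm the CUDA build of torch can see the RTX 3090 listed above.
print(torch.__version__)                   # e.g. 2.3.0
print(torch.cuda.is_available())           # should be True on a working setup
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))   # e.g. "NVIDIA GeForce RTX 3090"
    print(torch.version.cuda)              # CUDA version torch was built against
```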
-
### System Info
- `transformers` version: 4.40.0
- Platform: Linux-6.1.58+-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.22.2
- Safetensors version: 0.4.3
- Accele…
-
In the latest commit, https://huggingface.co/mosaicml/mpt-7b/commit/67cf22a4e6809edb7308dd0a2ae2c1ffb86f4984, BigDL throws the error below when generating text.
INFO 2024-02-20 06:41:05,962 proxy 172.17.0.2 …
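The log is truncated before the traceback; for context, a minimal sketch of the generation call that exercises this path (assuming the standard `transformers` loading flow for MPT; the BigDL-specific serving wrapper from the report is not shown):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# MPT-7B ships custom modeling code, so trust_remote_code=True is required.
model_id = "mosaicml/mpt-7b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```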
-
I want to quantize a model from [open-flamingo](https://github.com/mlfoundations/open_flamingo) or https://github.com/open-mmlab/Multimodal-GPT (open-flamingo v1) before LoRA training,
https://github…
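The usual pattern for this in the Hugging Face stack is 4-bit loading via `BitsAndBytesConfig`, then `prepare_model_for_kbit_training` before attaching LoRA adapters. A minimal sketch (the model id and target modules are placeholders, and open-flamingo's custom architecture may need its own loading path rather than `AutoModelForCausalLM`):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization config (QLoRA-style).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Placeholder model id; this only illustrates the quantize-then-LoRA order.
model = AutoModelForCausalLM.from_pretrained("some/base-lm", quantization_config=bnb_config)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # placeholder target module names
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```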
-
### System Info
python version: 3.11.9
transformers version: 4.44.2
accelerate version: 0.33.0
torch version: 2.4.0+cu121
### Who can help?
@gante
### Information
- [X] The official example sc…
-
### 🐛 Describe the bug
When using `torch.nn.functional.scaled_dot_product_attention` with autograd, a tensor filled with NaN values is returned after a few backward passes. Using `torch.autograd.s…
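The description is cut off, but the shape of the reproduction it points to would look roughly like this (a sketch under assumed tensor sizes; the exact inputs that trigger the NaNs are not shown in the excerpt):

```python
import torch
import torch.nn.functional as F

# torch.autograd.set_detect_anomaly(True) can help localize the op producing NaNs.
torch.manual_seed(0)
q = torch.randn(2, 8, 64, 32, requires_grad=True)  # (batch, heads, seq_len, head_dim)
k = torch.randn(2, 8, 64, 32, requires_grad=True)
v = torch.randn(2, 8, 64, 32, requires_grad=True)

for step in range(10):
    out = F.scaled_dot_product_attention(q, k, v)
    out.sum().backward()
    # Check the gradients for NaNs after each backward pass.
    if torch.isnan(q.grad).any():
        print(f"NaN gradient at step {step}")
        break
    q.grad = k.grad = v.grad = None
```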
-
### Expected Behavior
It should produce a video using the LTX-Video workflow
### Actual Behavior
A pop-up appears with the error `The expanded size of the tensor (192) must match the existing size (768) at…
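For context, this message is PyTorch's standard expand/broadcast shape error; a minimal example that raises the same class of error, with sizes mirroring the 192 vs. 768 from the report:

```python
import torch

a = torch.zeros(1, 768)
try:
    # expand() can only grow singleton dimensions, so asking a size-768
    # dimension to become 192 raises the error shown in the pop-up.
    a.expand(1, 192)
except RuntimeError as e:
    print(e)  # The expanded size of the tensor (192) must match the existing size (768) ...
```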
-
### 🐛 Describe the bug
```
checkpoint_path = './llama_relevance_results'
training_args = transformers.TrainingArguments(
    #remove_unused_columns=False, # Whether or not to automatically r…
```
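The call is cut off mid-snippet; a self-contained version of the same setup would look roughly like this (argument values are illustrative, not the reporter's):

```python
import transformers

checkpoint_path = './llama_relevance_results'

# Illustrative values only; the original issue's arguments are truncated.
training_args = transformers.TrainingArguments(
    output_dir=checkpoint_path,
    per_device_train_batch_size=4,
    num_train_epochs=1,
    logging_steps=10,
    remove_unused_columns=False,  # keep extra dataset columns for a custom collator
)
```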
-
```
HeadConfig(
    name="num_tokens_regression",
    layer_hook=-7,
    hidden_size=128,  # MLP hidden size
    num_layers=3,     # 2 hidden layers in MLP
    in_size=hidden_s…
```
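Reading the config, the head described is a small MLP hooked onto the hidden states of the 7th-from-last layer. A rough plain-PyTorch sketch of that shape (illustrative only, not the library's actual implementation; `base_hidden_size` stands in for the truncated `in_size` value):

```python
import torch.nn as nn

base_hidden_size = 4096  # placeholder for the truncated in_size value

# num_layers=3 with hidden_size=128: an input projection, one hidden
# layer, and a single-unit regression output (2 hidden layers total).
num_tokens_head = nn.Sequential(
    nn.Linear(base_hidden_size, 128),
    nn.ReLU(),
    nn.Linear(128, 128),
    nn.ReLU(),
    nn.Linear(128, 1),
)
```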
-
- [x] Usage in the [decoder](https://github.com/emma-mens/transformers/blob/main/src/transformers/models/opt/modeling_opt.py#L316) layer and the corresponding `past_key_values` [usage](https://github.…