-
### System Info
- `transformers` version: 4.46.0
- Platform: Linux-5.15.0-97-generic-x86_64-with-glibc2.35
- Python version: 3.12.3
- Huggingface_hub version: 0.26.1
- Safetensors version: 0.4.…
-
Hi, thank you for your great work! I'd like to reproduce full-parameter fine-tuning with DPO training. However, I only have 10 NVIDIA A40 GPUs (46 GB of memory each).
I tried the command
`CUDA_VI…
-
### 🐛 Describe the bug
Currently, when using FSDP, the model is loaded completely on CPU for each of the N processes, leading to huge CPU RAM usage. When training models like Falcon-40B with FSDP on…
-
### System Info
```Shell
Copy-and-paste the text below in your GitHub issue
- `Accelerate` version: 1.0.0
- Platform: Linux-6.10.11-amd64-x86_64-with-glibc2.40
- `accelerate` bash location: /dis…
-
Hi,
Is there a reason why it doesn't work with Python > 3.10?
For example, when I try to run `pip install first-breaks-picking-gpu`, I get this error:
```
ERROR: Ignored the following versions tha…
-
Vulnerable Library - bert_score-0.3.13-py3-none-any.whl
Path to dependency file: /packages/bert/requirements.txt
Path to vulnerable library: /packages/bert/requirements.txt
Found in HEAD commit:…
-
### 🐛 Describe the bug
Hello, when using DDP to train a model, I found that combining a multi-task loss with gradient checkpointing can lead to gradient synchronization failure betwe…
-
## Description
Relatively minor, but explicitly omitting `allow-same-origin` from the help widget iframe `sandbox` attribute in packages/help-extension breaks search pages on many reference documen…
-
**Describe**
Model I am using: TextDiffuser
Hi, thanks for the great work. I'm trying to train the model on a portion of the Mario-Laion image dataset (~50k images).
But currently the images generat…
-
# DataLoader architecture updates and TarDataset implementation
# Problem statement
This proposal aims to construct a modular, user-friendly, and performant toolset to address the ambiguous activi…