-
### 🐛 Describe the bug
When enabling AMP training with PyTorch native autocast, I noticed an obvious difference between the DDP-based model and the FSDP-based model.
Here is a minimal example …
-
> Today we’re releasing the next step: QDoRA. This is just as memory efficient and scalable as FSDP/QLoRA, and critically is also as accurate for continued pre-training as full weight training. We thi…
-
The proposed work tasks are as below:
- [ ] Enable CI support for IBM Cloud to enhance the testing infrastructure for FSDP
- [ ] Benchmark new model(s) for FSDP training - e.g. add new hf_T5 with 3B…
-
I'm trying to continue training from a checkpoint, but I get an error. Could you provide example code for this?
Model: `unsloth/tinyllama-bnb-4bit`
```
from trl import SFTTrainer
trainer = SFTTrainer(
…
-
(for later)
-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/OpenAccess-AI-Collective/axolotl/labels/bug) and didn't find any similar reports.
…
-
Hello!
Thank you for your work on MLLM.
I had a fine-tuning bug that I couldn't fix: when I ran the `stage2_sft.sh` script and trained with speech_conv_datasets only, the logger showed that the trai…
-
When training without passing the `mixed_precision` argument to FSDP, a dtype mismatch error is raised in `dinov2/layers/block.py`. Is this expected?
Full stacktrace:
```txt
File "/.…
-
Please fix the following issues.
First, make sure to install the required tools:
```
pip3 install pydocstyle
```
```
pip3 install ruff
```
Then complete the following steps:
1. Run `pydocst…
-
Hello,
Currently I am trying to run the qlora.py script with the 65B model on 2 A100 40GB GPUs with the command
`accelerate launch qlora.py --args`
with `--args` the ones given in the rep…