pytorch / torchtune

PyTorch native finetuning library
https://pytorch.org/torchtune/main/
BSD 3-Clause "New" or "Revised" License

QLoRA + FSDP #1029

Closed · jeromeku closed this issue 4 months ago

jeromeku commented 5 months ago

Is it possible to run QLoRA finetuning on more than a single device? I don't see any configs for QLoRA other than for single_device.

If not, what are the gating issues? More generally, what methods need to be implemented for custom tensor types (e.g., NF4) in order to compose with FSDP or other distributed training methods?
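For context, FSDP2's per-parameter sharding composes with tensor subclasses through a pair of all-gather extension hooks that the subclass defines. Below is a rough sketch of what such a subclass might look like; the hook names `fsdp_pre_all_gather` / `fsdp_post_all_gather` are the extension points FSDP2 looks for, but their exact signatures shifted across nightlies at the time, and `QuantizedTensor` here is a simplified stand-in rather than the real NF4 implementation (which also needs `__torch_dispatch__` overrides for the ops FSDP calls, such as splitting and copying shards):

```python
import torch

class QuantizedTensor(torch.Tensor):
    """Illustrative quantized wrapper subclass; not the actual torchao NF4Tensor."""

    @staticmethod
    def __new__(cls, quantized_data: torch.Tensor, scale: torch.Tensor, orig_shape):
        # Wrapper subclass: the outer tensor advertises the original
        # (dequantized) shape/dtype while holding compressed storage inside.
        return torch.Tensor._make_wrapper_subclass(
            cls, orig_shape, dtype=torch.bfloat16, device=quantized_data.device
        )

    def __init__(self, quantized_data, scale, orig_shape):
        self.quantized_data = quantized_data  # e.g. packed 4-bit codes
        self.scale = scale                    # per-block quantization scales

    def dequantize(self) -> torch.Tensor:
        # Placeholder: a real NF4 implementation unpacks 4-bit codes and
        # applies per-block scales here.
        return self.quantized_data.to(torch.bfloat16) * self.scale

    # --- FSDP2 all-gather extension hooks (signatures approximate) ---
    def fsdp_pre_all_gather(self, mesh):
        # Hand FSDP the flat inner tensors to all-gather (still quantized,
        # so communication stays in the compressed format), plus metadata
        # needed to rebuild the subclass afterwards.
        return (self.quantized_data, self.scale), (self.shape,)

    def fsdp_post_all_gather(self, all_gather_outputs, metadata, param_dtype, *, out=None):
        quantized_data, scale = all_gather_outputs
        (orig_shape,) = metadata
        if out is not None:
            # A real implementation would copy the gathered data into
            # `out`'s inner tensors and return None.
            return
        # Rebuild the subclass from the gathered shards; FSDP also receives
        # the tuple of inner tensors whose storage it may free after use.
        return QuantizedTensor(quantized_data, scale, orig_shape), (quantized_data, scale)
```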

ebsmothers commented 5 months ago

Hi @jeromeku, thanks for creating the issue. We don't yet support QLoRA on multiple devices, but this is a high priority and we're hoping to have it working very soon! There is active work going on to enable this; see PR #909.

felipemello1 commented 4 months ago

Closing this since #909 was merged. You will need PyTorch nightlies to run it. Config here: https://github.com/pytorch/torchtune/blob/6f37d15b2c99d49ca926173455569aa6f8e24d9d/recipes/configs/llama3/70B_full.yaml#L9
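For reference, the distributed path builds on FSDP2's `fully_shard` API, which at the time lived under `torch.distributed._composable.fsdp` in nightlies. A minimal sketch of the wrapping pattern, assuming a model whose transformer blocks sit in a `model.layers` list (this is illustrative, not the actual recipe code):

```python
import torch
from torch.distributed._composable.fsdp import fully_shard  # nightly-only path at the time

def shard_model(model: torch.nn.Module) -> torch.nn.Module:
    """Apply FSDP2 per-parameter sharding; assumes torch.distributed is initialized."""
    # Shard each transformer block individually so only one block's
    # parameters are gathered at a time during forward/backward.
    for layer in model.layers:
        fully_shard(layer)
    # Shard whatever remains at the root (embeddings, output head, norms).
    fully_shard(model)
    return model
```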

Please feel free to reopen the issue if you still have questions. Thanks! :)