
Llama 3.1 liger example is not working #1892

Closed: Stealthwriter closed this issue 2 days ago

Stealthwriter commented 2 months ago

Please check that this issue hasn't been reported before.

Expected Behavior

It should train.

Current behaviour

examples/llama-3/fft-8b-liger-fsdp.yaml

This example is not working: "optimizer: paged_adamw_8bit" is not compatible with FSDP. I tried changing it, but I still get this error: "Value error, FSDP Offload not compatible with adamw_bnb_8bit".

When I commented out the FSDP settings and used DeepSpeed instead, it worked.
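
For reference, a minimal sketch of that workaround (the deepspeed path below points at the ZeRO-2 config that ships in the axolotl repo; the specific keys commented out are my assumption, not something stated in the report):

# swap the FSDP block for a DeepSpeed config
# (deepspeed_configs/zero2.json ships with the axolotl repo)
deepspeed: deepspeed_configs/zero2.json

# fsdp:
#   - full_shard
#   - auto_wrap
# fsdp_config: ...

# the 8-bit optimizer that fails under FSDP reportedly works here
optimizer: paged_adamw_8bit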

Steps to reproduce

Run the example as-is.
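
For concreteness, one way to launch it, assuming the accelerate-based entry point from the repo's README:

accelerate launch -m axolotl.cli.train examples/llama-3/fft-8b-liger-fsdp.yaml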

Config yaml

No response

Possible solution

No response

Which Operating Systems are you using?

Python Version

3.10

axolotl branch-commit

latest

ganler commented 1 month ago

It seems not to work for Mistral either.

I got the same error as in https://github.com/linkedin/Liger-Kernel/issues/100, even after upgrading Liger.

NanoCode012 commented 2 weeks ago

Hello! I could not reproduce this issue on current main. I ran it on 2x L40 and it works, with the dataset edits below (needed due to some recent changes):

datasets:
  - path: mlabonne/FineTome-100k
    type: chat_template
    split: train[:20%]
+    field_messages: conversations
+    message_field_role: from
+    message_field_content: value

optimizer: adamw_torch
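
For clarity, here is the dataset block with that diff applied (the comments are mine: FineTome-100k stores each turn as "from"/"value" pairs under a "conversations" key, which is why the remapping is needed):

datasets:
  - path: mlabonne/FineTome-100k
    type: chat_template
    split: train[:20%]
    # map FineTome's ShareGPT-style fields onto
    # what the chat_template loader expects
    field_messages: conversations
    message_field_role: from
    message_field_content: value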

Could either of you clarify if you still see this issue?

winglian commented 2 days ago

The current version of the example should be correct now. 8-bit optimizers do not work with FSDP1, so you should use regular 32-bit optimizers with FSDP:

https://github.com/bitsandbytes-foundation/bitsandbytes/issues/89
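
For reference, a minimal sketch of a compatible pairing, reusing the fsdp settings from the shipped example (the exact fsdp_config keys can differ between axolotl versions, so treat this as an assumption rather than the canonical config):

# a regular 32-bit optimizer works with FSDP1;
# 8-bit bitsandbytes optimizers (adamw_bnb_8bit, paged_adamw_8bit) do not
optimizer: adamw_torch

fsdp:
  - full_shard
  - auto_wrap
fsdp_config:
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_transformer_layer_cls_to_wrap: LlamaDecoderLayer
  fsdp_state_dict_type: FULL_STATE_DICT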