Closed: Stealthwriter closed this issue 2 days ago.
This doesn't seem to work for Mistral either.
I got the same error as in https://github.com/linkedin/Liger-Kernel/issues/100 even after upgrading Liger.
Hello! I could not reproduce this issue on current main. I ran it on 2x L40 and it works (with the dataset edits below, needed because of some recent changes).
```diff
 datasets:
   - path: mlabonne/FineTome-100k
     type: chat_template
     split: train[:20%]
+    field_messages: conversations
+    message_field_role: from
+    message_field_content: value
 optimizer: adamw_torch
```
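For context, the three added keys just tell the `chat_template` loader where the conversation turns live in each record. FineTome-100k stores ShareGPT-style rows, roughly of the following shape (an illustrative sketch based on the field names above, not an actual row from the dataset):

```yaml
# Illustrative sketch of one record's shape (field names taken from the config above)
conversations:                 # field_messages points at this list
  - from: human                # message_field_role reads the "from" key
    value: "How do I ...?"     # message_field_content reads the "value" key
  - from: gpt
    value: "You can ..."
```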
Could either of you clarify if you still see this issue?
The current version of the example should be correct now. 8-bit optimizers do not work with FSDP1, so you should use a regular 32-bit optimizer with FSDP.
https://github.com/bitsandbytes-foundation/bitsandbytes/issues/89
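In config terms the fix amounts to something like this (a minimal sketch; the FSDP keys shown are assumptions for illustration and should match whatever the shipped example actually uses):

```yaml
# Full-precision optimizer; bitsandbytes 8-bit optimizers such as
# paged_adamw_8bit / adamw_bnb_8bit fail the FSDP compatibility check.
optimizer: adamw_torch

# Illustrative FSDP1 settings (assumed, adjust to the example's actual block);
# keep CPU offload off, since the reported error mentions FSDP Offload.
fsdp:
  - full_shard
  - auto_wrap
fsdp_config:
  fsdp_offload_params: false
  fsdp_state_dict_type: FULL_STATE_DICT
```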
Please check that this issue hasn't been reported before.
Expected Behavior
It should train.
Current behaviour
examples/llama-3/fft-8b-liger-fsdp.yaml
This example is not working: `optimizer: paged_adamw_8bit` is not compatible with FSDP. I tried changing the optimizer, but I still get this error: `Value error, FSDP Offload not compatible with adamw_bnb_8bit`.
When I commented out the FSDP settings and used DeepSpeed instead, it worked.
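For reference, the workaround looked roughly like this (a sketch; the DeepSpeed JSON path is just an example, point it at whichever config you actually use):

```yaml
# FSDP settings from the example commented out...
# fsdp:
#   - full_shard
#   - auto_wrap
# fsdp_config:
#   ...

# ...and DeepSpeed used instead (example path, adjust as needed)
deepspeed: deepspeed_configs/zero3_bf16.json
```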
Steps to reproduce
Run the example as-is.
Config yaml
No response
Possible solution
No response
Which Operating Systems are you using?
Python Version
3.10
axolotl branch-commit
latest
Acknowledgements