DD-DuDa / BitDistiller

[ACL 2024] A novel QAT with Self-Distillation framework to enhance ultra low-bit LLMs.
MIT License

about device #9

Open pwd11 opened 6 days ago

pwd11 commented 6 days ago

If I want to replicate your experiment, will three 4090 graphics cards suffice?

DD-DuDa commented 2 days ago

If you are running 7B models, I think it would be OK. Just try adjusting the DeepSpeed config.
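
As a rough illustration of what adjusting the DeepSpeed config could look like on three 24 GB cards, here is a minimal sketch that builds a ZeRO stage 3 config with CPU offload. The batch sizes, the offload choices, and the helper name `build_ds_config` are assumptions made for illustration, not the settings shipped with this repository; the config keys themselves are standard DeepSpeed options.

```python
import json


def build_ds_config(num_gpus=3, micro_batch_per_gpu=1, grad_accum_steps=8):
    """Build a DeepSpeed config dict.

    All values here are illustrative assumptions, not the repository's
    defaults. Tune them to the memory actually available.
    """
    return {
        # Keep the per-GPU micro batch small to limit activation memory.
        "train_micro_batch_size_per_gpu": micro_batch_per_gpu,
        "gradient_accumulation_steps": grad_accum_steps,
        # DeepSpeed requires: train_batch_size = micro batch * accumulation * GPUs.
        "train_batch_size": micro_batch_per_gpu * grad_accum_steps * num_gpus,
        "bf16": {"enabled": True},
        "zero_optimization": {
            # Stage 3 shards parameters, gradients, and optimizer states.
            "stage": 3,
            # Offloading to CPU trades training speed for GPU memory.
            "offload_optimizer": {"device": "cpu", "pin_memory": True},
            "offload_param": {"device": "cpu", "pin_memory": True},
            "overlap_comm": True,
            "contiguous_gradients": True,
        },
    }


config = build_ds_config()
print(json.dumps(config, indent=2))
```

If this still runs out of memory, lowering the micro batch size or sequence length is usually the first thing to try. If there is headroom, removing `offload_param` should speed up training.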