EasonXiao-888 / GrootVL

[NeurIPS2024 Spotlight] The official implementation of GrootVL: Tree Topology is All You Need in State Space Model

Global batch size for the GrootL instruction-tuning #11

Open zhan8855 opened 1 month ago

zhan8855 commented 1 month ago

Could you please tell me how many GPUs are used for GrootL instruction-tuning? Thank you so much for your awesome work!

EasonXiao-888 commented 3 weeks ago

Thanks for your interest! By default, we use 8 A100 GPUs for SFT.
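
For readers relating the GPU count to the global batch size asked about in the issue title: under plain data parallelism, the global batch size is usually the per-GPU batch size times the number of GPUs times the gradient accumulation steps. The sketch below only illustrates that arithmetic. The 8 GPUs come from the reply above; the per-GPU batch size and accumulation steps are hypothetical placeholders, not values confirmed by the authors, and `global_batch_size` is an illustrative helper, not a function from the GrootVL repository.

```python
def global_batch_size(per_device_batch_size: int, num_gpus: int,
                      grad_accum_steps: int = 1) -> int:
    """Samples consumed per optimizer step under data parallelism."""
    if min(per_device_batch_size, num_gpus, grad_accum_steps) < 1:
        raise ValueError("all arguments must be positive integers")
    return per_device_batch_size * num_gpus * grad_accum_steps


NUM_GPUS = 8  # stated in the reply above (8 A100 GPUs)

# Hypothetical placeholders: check the repository's training config
# for the values actually used.
PER_DEVICE_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 2

print(global_batch_size(PER_DEVICE_BATCH_SIZE, NUM_GPUS, GRAD_ACCUM_STEPS))
```

When reproducing on fewer GPUs, raising the accumulation steps by the same factor keeps the product, and therefore the global batch size, unchanged.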

zhan8855 commented 2 weeks ago

Thanks!