OpenGVLab / LAMM

[NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents
https://openlamm.github.io/

The save_model function in src/model/agent.py may cause bugs when using DeepSpeed ZeRO-3 #32

Closed: lighten001 closed this issue 1 year ago

lighten001 commented 1 year ago

Here is the plan to fix it:

  1. Save the model only in the rank 0 process.
  2. For ZeRO-3, consolidate the weights on rank 0 before saving the trainable parameters.
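
The two steps above could be sketched roughly as follows. This is only an illustration, not the code from src/model/agent.py or from the eventual fix: the helper names `collect_trainable_state` and `save_trainable`, and the `rank` and `zero3` arguments, are hypothetical. It assumes the ZeRO-3 weights are gathered with `deepspeed.zero.GatheredParameters`, which makes the full tensors visible on every rank inside the context; the file is then written on rank 0 only.

```python
import contextlib


def collect_trainable_state(model, zero3=False):
    """Return {name: CPU copy} for every parameter with requires_grad=True.

    With zero3=True this must be called on ALL ranks, because gathering
    the partitioned weights is a collective operation.
    """
    named = [(n, p) for n, p in model.named_parameters() if p.requires_grad]
    if zero3:
        # Imported lazily so the non-ZeRO path does not need DeepSpeed.
        import deepspeed
        gather = deepspeed.zero.GatheredParameters([p for _, p in named])
    else:
        gather = contextlib.nullcontext()
    with gather:
        # Copy inside the context: ZeRO-3 re-partitions the weights on exit.
        return {n: p.detach().cpu().clone() for n, p in named}


def save_trainable(model, path, rank, zero3=False):
    """Collect on every rank, but write the file on rank 0 only."""
    state = collect_trainable_state(model, zero3=zero3)
    if rank != 0:
        return False
    import torch  # only needed on the rank that actually writes
    torch.save(state, path)
    return True
```

In a training script, `rank` would typically come from `torch.distributed.get_rank()`. Gathering all trainable parameters at once is the simplest option but can be memory-heavy for large models; gathering them in smaller groups is a possible refinement.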

OpenLAMM commented 1 year ago

Fixed in PR #33