Luodian / Otter

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
https://otter-ntu.github.io/
MIT License

[Feat/Model/Train] updates for Flamingo's lightweight pre-training. #222

Closed Luodian closed 1 year ago

Luodian commented 1 year ago
  1. DeepSpeed ZeRO-2 integration + DDP training
    • 5-6x speed improvement in pretraining/instruction-tuning compared to our previous FSDP pipeline.
  2. Support Flamingo pretraining on LAION-400M/CC3M.
    • Now you can train your lightweight Flamingo model on smaller datasets.
  3. Add LoRA support for tuning the LLM decoder.
    • Yes, we have LoRA too. You can customize your model better in specific scenarios.
  4. Support cross-usage of Otter and Flamingo models.
    • Yes, you can load Flamingo weights directly into Otter models, and vice versa.
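
For readers unfamiliar with item 1, here is a minimal sketch of what a DeepSpeed ZeRO stage 2 configuration can look like. It is not the config shipped in this PR: the helper name `build_zero2_config`, the batch size, and the bucket sizes are illustrative assumptions. Only the config keys themselves are standard DeepSpeed options.

```python
import json


def build_zero2_config(micro_batch_size: int = 8, grad_accum_steps: int = 1) -> dict:
    """Build a DeepSpeed config dict that enables ZeRO stage 2.

    Hypothetical helper with placeholder values; not taken from the Otter repo.
    """
    return {
        # Per-GPU batch size and accumulation steps (placeholder values).
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "gradient_accumulation_steps": grad_accum_steps,
        # Mixed precision; assumes hardware with bf16 support.
        "bf16": {"enabled": True},
        "zero_optimization": {
            # Stage 2 partitions optimizer states and gradients across ranks.
            "stage": 2,
            # Overlap gradient reduction with the backward pass.
            "overlap_comm": True,
            "contiguous_gradients": True,
            # Bucket sizes in elements (illustrative, tune for your hardware).
            "reduce_bucket_size": 500_000_000,
            "allgather_bucket_size": 500_000_000,
        },
    }


if __name__ == "__main__":
    # Print the config as JSON so it can be saved to a file.
    print(json.dumps(build_zero2_config(), indent=2))
```

A dict like this could be written to a JSON file or, assuming a standard DeepSpeed setup, passed through the `config` argument of `deepspeed.initialize`. Check the training scripts in this PR for the settings actually used.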