Luodian / Otter

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
https://otter-ntu.github.io/
MIT License
3.55k stars, 242 forks

[Fix/Train/Model] debug the fp16 scale issue for loss backward #195

Closed. ZhangYuanhan-AI closed this 1 year ago

ZhangYuanhan-AI commented 1 year ago
  1. Debug the fp16 scaling issue in the loss backward pass.
  2. Debug the label mask problem: apply the mask to the whole batch instead of only the first item in the batch.
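
For readers without the diff at hand, the two items can be illustrated with a minimal, framework-free sketch. It is not the code from this PR. The PR does not say what the scale issue was, so the first part only shows the general reason fp16 training scales the loss before backward: very small gradient values round to zero in half precision. The second part shows a label mask applied to every row of a batch. The helper names (`to_fp16`, `scaled_fp16_grad`, `mask_labels`), the answer-marker convention, and the token ids are all hypothetical; `-100` is the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
import struct

IGNORE_INDEX = -100  # default ignore_index of PyTorch's CrossEntropyLoss


def to_fp16(x: float) -> float:
    """Round a Python float to IEEE half precision and back (stdlib only)."""
    return struct.unpack("e", struct.pack("e", x))[0]


def scaled_fp16_grad(grad: float, scale: float) -> float:
    """Hypothetical helper: store grad * scale in fp16, then unscale in
    higher precision, mimicking what loss scaling does for gradients."""
    return to_fp16(grad * scale) / scale


def mask_labels(batch_ids, pad_id, answer_id):
    """Hypothetical helper: build training labels for every row of the batch.

    Assumed convention: tokens up to and including the first answer marker,
    plus all padding tokens, are replaced with IGNORE_INDEX so they do not
    contribute to the loss.
    """
    labels = []
    for row in batch_ids:  # iterate over all rows, not only batch_ids[0]
        masked = list(row)
        seen_answer = False
        for i, tok in enumerate(row):
            if not seen_answer or tok == pad_id:
                masked[i] = IGNORE_INDEX
            if tok == answer_id:
                seen_answer = True
        labels.append(masked)
    return labels
```

A bug of the kind described in item 2 would correspond to running the inner loop for the first row only, which leaves the prompt and padding tokens of the remaining rows in the loss.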