Luodian / Otter

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
https://otter-ntu.github.io/
MIT License
3.56k stars, 241 forks

Multiple issues with OtterHD #315

Closed StrangeTcy closed 10 months ago

StrangeTcy commented 10 months ago

https://github.com/Luodian/Otter/blob/main/docs/OtterHD.md, as it stands at the time of writing this issue, is a weird document:

  1. the Technical report link leads to a file that doesn't exist (https://github.com/Luodian/Otter/blob/main/docs/link)
  2. the Demo link leads to a huggingface page, which links to Checkpoints that don't exist at the moment (see https://github.com/Luodian/Otter/issues/303 as well)
  3. for now there's only a finetuning script (for anyone wishing to replicate your finetuning efforts, I guess), but no inference script (probably due to 2.)

Hope the checkpoints get released soon!

Luodian commented 10 months ago

Hi, I've already fixed issue 1. For issue 2, the ETA is early Dec, and no later than late Dec. We are running some experiments on resizing vs. padding, and it will take some effort to finish them and provide the best checkpoint for both good demo & benchmark performance.
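For readers unfamiliar with the resizing vs. padding trade-off mentioned above, here is a minimal sketch of the geometry involved. It is not taken from the Otter codebase: the function names, the `Placement` type, and the 1024x1024 target canvas are all assumptions made for illustration, and the thread does not say how either option is actually implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Where the source image ends up inside the target canvas (hypothetical type)."""
    scaled_w: int
    scaled_h: int
    pad_right: int
    pad_bottom: int
    aspect_distortion: float  # 1.0 means the aspect ratio is preserved


def plan_resize(src_w: int, src_h: int, dst_w: int, dst_h: int) -> Placement:
    """Stretch the image to fill the canvas exactly; this may distort it."""
    # Ratio of horizontal to vertical scale factors; 1.0 means no distortion.
    distortion = (dst_w / src_w) / (dst_h / src_h)
    return Placement(dst_w, dst_h, 0, 0, distortion)


def plan_pad(src_w: int, src_h: int, dst_w: int, dst_h: int) -> Placement:
    """Scale uniformly so the image fits, then pad the leftover area."""
    scale = min(dst_w / src_w, dst_h / src_h)
    # Clamp to the canvas in case rounding overshoots by a pixel.
    scaled_w = min(dst_w, round(src_w * scale))
    scaled_h = min(dst_h, round(src_h * scale))
    # Aspect ratio is preserved up to integer rounding of the scaled size.
    return Placement(scaled_w, scaled_h, dst_w - scaled_w, dst_h - scaled_h, 1.0)


if __name__ == "__main__":
    # Assumed example: a 1920x1080 screenshot and a 1024x1024 canvas.
    print(plan_resize(1920, 1080, 1024, 1024))
    print(plan_pad(1920, 1080, 1024, 1024))
```

Resizing uses every pixel of the canvas but changes the aspect ratio of non-square inputs, while padding keeps the aspect ratio and spends part of the canvas on filler pixels. Which one gives better demo and benchmark results is an empirical question, which is presumably what the experiments are meant to answer.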

We've updated the demo with a recently trained model; you can take a look at: https://huggingface.co/spaces/Otter-AI/OtterHD-Demo

As for issue 3, the inference script: you can take a look at the following file, which already defines an OtterHD class. https://github.com/Luodian/Otter/blob/main/pipeline/benchmarks/models/otterhd.py

It's basically the same as Fuyu's: https://github.com/Luodian/Otter/blob/main/pipeline/benchmarks/models/fuyu.py

StrangeTcy commented 9 months ago

As I write this, it's December 18th. Is that late enough? Do you have any updates on the ETA?

A link to your wandb page would be super cool, so that we can all see how the model is doing.