Luodian / Otter

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
https://otter-ntu.github.io/
MIT License

How to run the otter demo with my local model weight? #207

Closed Hongbin98 closed 1 year ago

Hongbin98 commented 1 year ago

For example, the model is loaded from Hugging Face with the following command: `model = OtterForConditionalGeneration.from_pretrained("luodian/OTTER-9B-LA-InContext", device_map="sequential", **precision)`

After training, my weights are located at './Otter-main/OTTER-MPT7B-densecaption'.

My question is: how do I load my local weights and run the Otter demo? I guess something like `model = TempModel(); model.load_state_dict(torch.load(file_path))`? Also, I would like to thank you for your replies over the past few days, and I will cite your research in the future :)

Luodian commented 1 year ago

You can just replace luodian/xxxxx with your local folder path, e.g. /home/xxxx.

Note that it's a folder path (a Hugging Face-format model usually needs a folder to load from; it loads the checkpoints inside the folder according to the config.json in that folder).
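
For concreteness, here is a minimal sketch of loading from a local folder. The import path and the precision kwargs are assumptions based on the demo scripts, and the folder path is a placeholder for your own directory:

```python
# Minimal sketch: load Otter from a local HF-format folder instead of a Hub ID.
# The import path below is an assumption; adjust it to your checkout of the repo.
import torch
from otter.modeling_otter import OtterForConditionalGeneration

precision = {"torch_dtype": torch.bfloat16}  # whatever **precision you used with the Hub model
model = OtterForConditionalGeneration.from_pretrained(
    "/home/xxxx/OTTER-MPT7B-densecaption-hf",  # placeholder: folder containing config.json and weight files
    device_map="sequential",
    **precision,
)
```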

Hongbin98 commented 1 year ago

> You can just replace luodian/xxxxx with your local folder path, e.g. /home/xxxx.
>
> Note that it's a folder path (a Hugging Face-format model usually needs a folder to load from; it loads the checkpoints inside the folder according to the config.json in that folder).

I already tried this, but got an error when running `model = OtterForConditionalGeneration.from_pretrained("./Otter-main/OTTER-MPT7B-densecaption", device_map="auto")`. And there are files in this output dir, including 'config.json', 'final_weights.pt' and others.

Error: no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory ./Otter-main/OTTER-MPT7B-densecaption.

Luodian commented 1 year ago

It may be a problem with the folder structure.

Inside the folder there should be a config.json and the weight files, e.g. a single pytorch_model.bin or sharded files like pytorch_model-00001-of-00004.bin.
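
For reference, a typical HF-format folder looks roughly like this (the folder name here is hypothetical, and the exact shard count and the index file depend on how the model was saved):

```
OTTER-MPT7B-densecaption-hf/
├── config.json
├── pytorch_model-00001-of-00004.bin   # sharded weights, or a single pytorch_model.bin
├── ...
├── pytorch_model-00004-of-00004.bin
└── pytorch_model.bin.index.json       # only present for sharded checkpoints
```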

You can check our Hugging Face repo to see the HF-format folder.

You may also check whether you saved your model in the correct format.

There's a --save_hf_model option for saving the model in the correct HF format.

Luodian commented 1 year ago

You may also refer to https://github.com/Luodian/Otter/blob/main/otter/flamingo_pt2otter_hf.py

This script is used to convert a final_weights.pt checkpoint into a correctly formatted HF folder.

OtterForConditionalGeneration.from_pretrained is the standard way to load an HF model; it takes the HF model folder as input~
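
For orientation, a conceptual sketch of what such a conversion does is shown below. The class names and import path are assumptions, and the real script may handle the config and weight keys differently, so check its argument definitions before relying on it:

```python
# Conceptual sketch of converting a training checkpoint into an HF-format folder.
# The import path and OtterConfig are assumptions; the real conversion script may
# differ, so treat this as an illustration only.
import torch
from otter.modeling_otter import OtterConfig, OtterForConditionalGeneration

# The training output dir already contains a config.json (see the error report above).
config = OtterConfig.from_pretrained("./Otter-main/OTTER-MPT7B-densecaption")
model = OtterForConditionalGeneration(config)

# Load the raw state dict saved by the trainer and write out an HF-format folder.
state_dict = torch.load("./Otter-main/OTTER-MPT7B-densecaption/final_weights.pt", map_location="cpu")
model.load_state_dict(state_dict, strict=False)  # strict=False in case of key-prefix mismatches
model.save_pretrained("./OTTER-MPT7B-densecaption-tohf")  # writes config.json + pytorch_model*.bin
```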

Hongbin98 commented 1 year ago

Following the suggestion of @Luodian, I have solved this issue.

If anyone follows the recommended training command to train Otter and then cannot load the checkpoint, I hope the following steps offer you a hand~

  1. Run converting_otter_pt_to_hf.py to convert your local weights to the Hugging Face format and save them in a new directory, e.g. ./OTTER-MPT7B-densecaption-tohf
  2. Then, change the path passed to OtterForConditionalGeneration.from_pretrained() in the demo file, i.e., replace luodian/xxxxx with ./OTTER-MPT7B-densecaption-tohf (see the sketch after this list)
  3. Run the demo and enjoy it!
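
A minimal sketch of step 2, using the converted folder name from step 1 as a placeholder:

```python
# Minimal sketch of step 2: point the demo's from_pretrained call at the
# converted local folder instead of the Hub ID.
model = OtterForConditionalGeneration.from_pretrained(
    "./OTTER-MPT7B-densecaption-tohf",  # local HF-format folder produced in step 1
    device_map="auto",
)
```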

In my opinion, a better way to avoid this issue is to add the --save_hf_model option when you train Otter.