tianzhaohaha closed this issue 9 months ago.
Also, many model paths are prefixed with hugging_cache/ — does that mean I should download the corresponding models myself?
Thanks for the reply, but sorry, I am still a little confused about these hparams. Could you kindly give me an example for MEND/minigpt4.yaml? I think it should only need a small change?
There are too many confusing hparams. Below is my current minigpt4.yaml; could someone help me fix it?
```yaml
device: 0
alg_name: "MEND"
name: lmsys/vicuna-7b-v1.5
model_name: minigpt4
model_class: Blip2OPT
tokenizer_class: LlamaTokenizer
tokenizer_name: lmsys/vicuna-7b-v1.5
inner_params:

alg: MEND
lr: 1e-6
edit_lr: 1e-4
lr_lr: 1e-4
lr_scale: 1.0
seed: 42
cedit: 0.1
iedit: 0.1
cloc: 1.0
cbase: 1.0
dropout: 0.0
train_base: False
no_grad_layers: null
one_sided: False
n_hidden: 1
hidden_dim: null
init: id
norm: True
combine: True
x_only: False
delta_only: False
act: relu
rank: 1920
mlp_class: IDMLP
shared: True
archive: results/models/MEND/minigpt4-vqa

batch_size: 1
model_save_pt: 5000
silent: False

max_iters: 50000
log_interval: 100
eval_log_interval: 1000
final_eval: True
val_interval: 5000
early_stop_patience: 20000
early_stop_key: "loss/total_edit_val"
eval_only: True
half: False
debug: False
save: False
verbose: True

val_batch_size: 1
accumulate_bs: 2
val_steps: 500 # only for debug
opt: Adam
grad_clip: 100.

results_dir: ./results

qformer_checkpoint: hugging_cache/pretrained_minigpt4_llama2_7b.pth
qformer_name_or_path: bert-base-uncased
state_dict_file: hugging_cache/eva_vit_g.pth
pretrained_ckpt: hugging_cache/pretrained_minigpt4_llama2_7b.pth

coco_image: ../
rephrase_image: ../
```
Now I get this error:

```
RuntimeError: Error(s) in loading state_dict for MiniGPT4:
    size mismatch for llama_proj.weight: copying a param with shape torch.Size([4096, 5632]) from checkpoint, the shape in current model is torch.Size([4096, 768]).
```

I think the name and the tokenizer_name may not be correct?
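For anyone debugging a size mismatch like this, a minimal sketch (plain PyTorch, not part of EasyEdit) for inspecting what the checkpoint actually contains; the path is the one from the config above:

```python
import torch

# Load the checkpoint on CPU and unwrap the LAVIS-style nesting, if present,
# where weights live under a top-level "model" key.
ckpt = torch.load("hugging_cache/pretrained_minigpt4_llama2_7b.pth", map_location="cpu")
state_dict = ckpt.get("model", ckpt)

# Print the projection-layer shapes to compare against the error message.
for key, tensor in state_dict.items():
    if "proj" in key:
        print(key, tuple(tensor.shape))
```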
You have incorrectly configured the `qformer_checkpoint` and `pretrained_ckpt` settings, deviating from the original repository's guidelines. Please refer to the Multimodal section in this file for the correct settings.
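For concreteness, a hypothetical corrected fragment along those lines. The exact file names must be taken from the repository's Multimodal guidelines: `blip2_pretrained_flant5xxl.pth` appears later in this thread, while `pretrained_minigpt4.pth` is only an assumption here.

```yaml
# Hypothetical fragment -- confirm the exact file names in the repo's Multimodal guide.
qformer_checkpoint: hugging_cache/blip2_pretrained_flant5xxl.pth  # BLIP-2 Q-Former weights
pretrained_ckpt: hugging_cache/pretrained_minigpt4.pth            # MiniGPT-4 projection checkpoint (assumed name)
```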
Feel free to specify any points of confusion so that we can optimize and provide clearer guidance in the future.
I really appreciate your clarification!
Another question: what is `qformer_checkpoint: hugging_cache/blip2_pretrained_opt2.7b.pth` in blip2.yaml? Do I need to run BLIP-2 first to get the corresponding pre-trained model before I can run your code? Actually, I tried to save BLIP-2 as a .pth file, but your code reported a format mismatch: `KeyError: 'model'`.
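A guess at what is happening here, based on the LAVIS checkpoint convention rather than EasyEdit's loader internals: LAVIS-style checkpoints nest the weights under a top-level `"model"` key, so saving a raw `state_dict` directly produces exactly this `KeyError: 'model'`. A self-contained illustration with stand-in weights:

```python
import torch

# Stand-in weights using the shape from the error above (not real BLIP-2 weights).
state_dict = {"opt_proj.weight": torch.zeros(2560, 768)}

# Wrapping the state_dict under a "model" key mimics the LAVIS checkpoint format.
torch.save({"model": state_dict}, "blip2_checkpoint_demo.pth")

ckpt = torch.load("blip2_checkpoint_demo.pth", map_location="cpu")
print(list(ckpt["model"].keys()))  # a loader doing ckpt["model"] now succeeds
```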
Thank you for your patient guidance. I am new to this repo, and the cost of reproducing your code is too high for me. Could you please just provide the correct yaml file (not an example) with the corresponding models? Your code does not support a wide range of models, at least on the multimodal part, so I don't think it's a good idea to ask researchers to find these models themselves. (Just a suggestion.)
Still getting an error:

```
RuntimeError: Error(s) in loading state_dict for Blip2OPT:
    size mismatch for opt_proj.weight: copying a param with shape torch.Size([2560, 768]) from checkpoint, the shape in current model is torch.Size([768, 768]).
    size mismatch for opt_proj.bias: copying a param with shape torch.Size([2560]) from checkpoint, the shape in current model is torch.Size([768]).
```

Below is my blip2.yaml:
```yaml
device: 1
alg_name: "MEND"
name: Salesforce/blip2-opt-2.7b
model_name: blip2
model_class: Blip2OPT
tokenizer_class: GPT2Tokenizer
tokenizer_name: Salesforce/blip2-opt-2.7b
inner_params:

alg: MEND
lr: 1e-6
edit_lr: 1e-4
lr_lr: 1e-4
lr_scale: 1.0
seed: 42
cedit: 0.1
iedit: 0.1
cloc: 1.0
cbase: 1.0
dropout: 0.0
train_base: False
no_grad_layers: null
one_sided: False
n_hidden: 1
hidden_dim: null
init: id
norm: True
combine: True
x_only: False
delta_only: False
act: relu
rank: 1920
mlp_class: IDMLP
shared: True
archive: results/models/MEND/blip2

batch_size: 1
model_save_pt: 5000
silent: False

max_iters: 50000
log_interval: 100
eval_log_interval: 1000
final_eval: True
val_interval: 5000
early_stop_patience: 20000
early_stop_key: "loss/total_edit_val"
eval_only: True
half: False
debug: False
save: False
verbose: True

val_batch_size: 1
accumulate_bs: 2
val_steps: 500 # only for debug
opt: Adam
grad_clip: 100.

results_dir: ./results

qformer_checkpoint: hugging_cache/blip2_pretrained_opt2.7b.pth
qformer_name_or_path: bert-base-uncased
state_dict_file: hugging_cache/eva_vit_g.pth

coco_image: ../
rephrase_image: ../
```
Thank you for your feedback.
You can follow the config file, where `model_name` and `tokenizer_name` use `opt-2.7b` instead of `blip2-opt-2.7b`. And I guess you didn't run a trainer, so you should configure MEND following hparams/TRAINING/MEND/blip2.yaml and refer to the example of using it provided here (see the sketch below).
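For reference, a minimal sketch of what running the MEND trainer looks like, following the multimodal training example in the repository's README; the class names come from that example, and the dataset paths are placeholders:

```python
from easyeditor import MENDMultimodalTrainingHparams, CaptionDataset, MultimodalTrainer

# Load the training hparams referenced above.
training_hparams = MENDMultimodalTrainingHparams.from_hparams("hparams/TRAINING/MEND/blip2.yaml")

# Placeholder dataset paths -- point these at your actual edit data.
train_ds = CaptionDataset("data/caption_train_edit.json", config=training_hparams)
eval_ds = CaptionDataset("data/caption_eval_edit.json", config=training_hparams)

# Train the MEND editor; this produces the archive that inference configs point to.
trainer = MultimodalTrainer(
    config=training_hparams,
    train_set=train_ds,
    val_set=eval_ds,
)
trainer.run()
```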
Hi, have you solved your issue yet?
Actually, I just want to run your example: EasyEdit_Example_Multimodal_IKE.ipynb. Still, I don't know which opt-2.7b I should set; hugging_cache is just an empty folder, so I think I should first download the model from Hugging Face for model_name and tokenizer_name. If I set hugging_cache/opt-2.7b, then I get this error:

```
OSError: hugging_cache/opt-2.7b is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login` and pass `use_auth_token=True`.
```
Thank you for the clarification. You can set `model_name` and `tokenizer_name` as `facebook/opt-2.7b` for convenience; note that `hugging_cache` is just our local folder for models manually downloaded from Hugging Face.
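As an aside, a minimal sketch for pre-downloading a model into such a local folder, assuming the `huggingface_hub` package is installed; the `hugging_cache/` layout is just the convention from this thread, not something EasyEdit enforces:

```python
from huggingface_hub import snapshot_download

# Download the full facebook/opt-2.7b snapshot into the local hugging_cache folder,
# so later config entries can point at a local path instead of the Hub identifier.
snapshot_download(repo_id="facebook/opt-2.7b", local_dir="hugging_cache/opt-2.7b")
```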
Certainly! If you have any more questions or need further assistance, I can reach out to you on WeChat using the provided username YouKn0wWho for convenient communication.
Hi, I tried to run the example of the multimodal IKE model-editing method, which uses the minigpt4.yaml hparams, but it raised `RuntimeError: checkpoint url or path is invalid`. I found that the issue might be `qformer_checkpoint: hugging_cache/blip2_pretrained_flant5xxl.pth` in minigpt4.yaml. Which URL should I set to execute the example here?
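In case it helps while waiting for an answer: that file name matches the BLIP-2 weights distributed by LAVIS. A hypothetical download sketch follows; the URL is an assumption taken from the upstream LAVIS/MiniGPT-4 configs and should be verified before use:

```python
import os
import urllib.request

# Assumed URL from upstream LAVIS / MiniGPT-4 configs -- verify before relying on it.
URL = ("https://storage.googleapis.com/sfr-vision-language-research/"
       "LAVIS/models/BLIP2/blip2_pretrained_flant5xxl.pth")

# Save the checkpoint where minigpt4.yaml expects it.
os.makedirs("hugging_cache", exist_ok=True)
urllib.request.urlretrieve(URL, "hugging_cache/blip2_pretrained_flant5xxl.pth")
```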