TencentARC / SmartEdit

Official code of SmartEdit [CVPR-2024 Highlight]

Question about the LLaVA ckpt for inference #5

Closed cocoshe closed 6 months ago

cocoshe commented 6 months ago

Thanks for your great work! I want to ask about the LLaVA-13B-v1 checkpoint used in the README script:

python test/DS_SmartEdit_test.py --is_reasoning_scenes True --model_name_or_path "./checkpoints/vicuna-13b-v1-1" --LLaVA_model_path "./checkpoints/LLaVA-13B-v1" --save_dir './checkpoints/SmartEdit-13B/Reason-15000' --steps 15000 --total_dir "./checkpoints/SmartEdit-13B" --sd_qformer_version "v1.1-13b" --resize_resolution 256

Is it produced the way the LLaVA instructions describe: take the base LLaMA weights, apply the released delta weights, and merge them with the conversion script?
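For reference, the delta-merge step from the LLaVA instructions looks roughly like this (the base-model path is a placeholder, and the delta name is an assumption for the v1.1-13B setup):

python -m llava.model.apply_delta --base /path/to/llama-13b --target ./checkpoints/LLaVA-13B-v1 --delta liuhaotian/LLaVA-13b-delta-v1-1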

I'm not sure about this, because newer LLaVA versions (1.5 and 1.6, for example) no longer require this merge step.

And if we switch the checkpoint to LLaVA-1.6, would that work?

yuzhou914 commented 6 months ago

Yes. Since the original LLaVA weights are released as delta weights (because of the base LLM's license), you need to convert the delta weights first. Please follow the LLaVA instructions to prepare the LLaVA checkpoint, then run LLaVA conversation inference to confirm the conversion is correct (i.e., the LLaVA conversation behaves as expected). By the way, please use LLaVA-1.1-7B/13B and no other LLaVA version for SmartEdit inference, since all training was done with LLaVA-1.1.
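As a sanity check after merging, something like the LLaVA CLI demo can be used (the entry point and flags below follow the current LLaVA repo and may differ in the v1.1-era code; the image path is a placeholder):

python -m llava.serve.cli --model-path ./checkpoints/LLaVA-13B-v1 --image-file ./examples/test.jpg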

cocoshe commented 6 months ago


OK, thanks for your reply!