Unrealluver opened this issue 1 year ago
Hi @Unrealluver, thank you for your feedback. I ran a test training process locally and do notice this performance drop with the latest code release. However, when I re-ran the commit from which I got the results during development, I was able to reproduce them. I am currently investigating this and will let you know soon.
Sorry about the confusion in the released code, and thank you again for the feedback.
@haotian-liu Thanks, I look forward to your further reply.
Hi @Unrealluver, it has been fixed now. There was an index that was not being updated, which caused issues for datasets with mixed image-text/text-only content. Please pull the latest code base and it should work now. Thanks!
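For anyone curious about the failure mode, here is a hypothetical, minimal illustration of this class of bug (not the actual repo code): if the cursor into the image list is advanced for every sample instead of only for samples that actually consume an image, all later image-text pairs get misaligned.

```python
# Hypothetical illustration of the described bug class, NOT the actual LLaVA code.
samples = [
    {"text": "caption A", "has_image": True},
    {"text": "plain QA",  "has_image": False},
    {"text": "caption B", "has_image": True},
]
images = ["img_A.png", "img_B.png"]

def pair_buggy(samples, images):
    # Bug: the image cursor advances on every sample, including text-only ones,
    # so "caption B" tries to read images[2], which does not exist
    # (with more images it would silently grab the wrong one instead).
    out, i = [], 0
    for s in samples:
        img = images[i] if s["has_image"] else None
        i += 1  # wrong: advances even when no image was consumed
        out.append((s["text"], img))
    return out

def pair_fixed(samples, images):
    # Fix: advance the cursor only when an image is actually consumed.
    out, i = [], 0
    for s in samples:
        if s["has_image"]:
            out.append((s["text"], images[i]))
            i += 1
        else:
            out.append((s["text"], None))
    return out

try:
    print(pair_buggy(samples, images))
except IndexError as e:
    print("buggy pairing fails on mixed data:", e)
print(pair_fixed(samples, images))  # "caption B" correctly pairs with img_B.png
```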
Please also re-download this checkpoint, thanks. I used this checkpoint to verify the ScienceQA finetuning; I am not sure why I uploaded the wrong version. Sorry again for the confusion.
Question
Thanks for the SQA finetuning script. After fine-tuning on SQA, I find that my result on the SQA test set (Total: 4241, Correct: 3670, Accuracy: 86.54%) is not as good as the result reported in the paper (Accuracy: 90.92%). Could you please share some advice on fixing the mismatch between the reproduced results and the paper?
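For a sense of scale, the gap corresponds to roughly 186 test questions; here is a quick arithmetic check using only the numbers above:

```python
total = 4241
correct = 3670
reproduced = correct / total   # 0.8654 -> 86.54%
paper = 0.9092                 # 90.92% reported in the paper
needed = round(paper * total)  # ~3856 correct answers to match the paper
print(f"reproduced: {reproduced:.2%}, gap: {needed - correct} questions")
```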
Here are my running scripts:
The `--model_name_or_path /share/project/lianghuizhu/vicuna-13b-v0` is the checkpoint obtained by applying the official Vicuna delta to LLaMA-13B. The `--pretrain_mm_mlp_adapter /home/zhulianghui/ProjectC_ChatGPT/llava/reference/LLaVA-13b-pretrain-projector-v0-CC3M-595K-original_caption-no_im_token.bin` is the projection layer provided in this repo that does not contain the im token. Finally, I ran the multi-GPU generation scripts in this repo to generate and gather the results.
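For reference, here is a minimal sketch of the kind of gather-and-score step described above (the file names and JSON layout are hypothetical, not the repo's actual evaluation scripts): each GPU writes its own JSONL of answers, the chunks are merged, and accuracy is computed against the ground-truth choices.

```python
import json
from glob import glob

# Hypothetical layout: each GPU wrote answers-<rank>.jsonl with lines like
# {"question_id": ..., "prediction": "A"}; ground truth maps question_id -> choice.
# The actual LLaVA evaluation scripts differ in detail.
def load_jsonl(path):
    with open(path) as f:
        return [json.loads(line) for line in f]

predictions = {}
for chunk in sorted(glob("answers-*.jsonl")):   # gather per-GPU output chunks
    for row in load_jsonl(chunk):
        predictions[row["question_id"]] = row["prediction"]

with open("scienceqa_test_answers.json") as f:  # hypothetical ground-truth file
    ground_truth = json.load(f)                 # {question_id: "A"/"B"/...}

correct = sum(predictions.get(qid) == ans for qid, ans in ground_truth.items())
total = len(ground_truth)
print(f"Total: {total}, Correct: {correct}, Accuracy: {correct / total:.2%}")
```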