yangbang18 / MultiCapCLIP

(ACL'2023) MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
BSD 3-Clause "New" or "Revised" License

What is the training and evaluation process of "finetune finetune_fewshot"? #4

Closed tiesanguaixia closed 1 year ago

tiesanguaixia commented 1 year ago

Thank you for the great work! In the Main block, I found the command: bash scripts/pipe.sh coco baseline "finetune finetune_fewshot". What training and evaluation process does it run? For the command bash scripts/pipe.sh coco baseline "finetune_fewshot", I think it trains the baseline model on a certain portion of the vision-text pairs in the training set and evaluates the model on the val/test pairs. Is this correct? If so, when the task is "finetune finetune_fewshot", why does the model undergo two training processes in succession, first on 100% and then on a small portion of the vision-text pairs? Perhaps I have misunderstood something. Thank you for your guidance!

yangbang18 commented 1 year ago

Q1: For the command bash scripts/pipe.sh coco baseline "finetune_fewshot", I think it trains the baseline model on a certain portion of the vision-text pairs in the training set and evaluates the model on the val/test pairs. Is this correct? A1: Yes, it is.

Q2: When the task is "finetune finetune_fewshot", why does the model undergo two training processes in succession, first on 100% and then on a small portion of the vision-text pairs? A2: Thank you for pointing out the problem. finetune_fewshot is not needed in the main block; it is only required for comparison in the semi-supervised setting. I will update the command in README.md.
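As a rough illustration of why a quoted task string such as "finetune finetune_fewshot" leads to two successive runs, here is a minimal sketch. It assumes the script walks the space-separated task names in order, as the question describes. The function run_tasks and its output format are hypothetical and are not taken from the repository's scripts/pipe.sh, which may work differently.

```shell
#!/usr/bin/env bash
# Hypothetical illustration only; this is NOT the repository's scripts/pipe.sh.
# run_tasks is an assumed helper that mimics a pipeline taking
# <dataset> <method> "<space-separated task names>".
run_tasks() {
    local dataset="$1" method="$2" tasks="$3"
    local task
    # Unquoted expansion is intentional: it splits the task string on whitespace,
    # so each task name becomes one iteration (one train + evaluate run).
    for task in $tasks; do
        echo "[$dataset/$method] train + evaluate: $task"
    done
}

# Two task names in one string: two runs, one after the other.
run_tasks coco baseline "finetune finetune_fewshot"

# A single task name: one run only.
run_tasks coco baseline "finetune"
```

Under this assumption, passing only "finetune" would correspond to the main block, while "finetune_fewshot" would be passed separately for the semi-supervised comparison.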

tiesanguaixia commented 1 year ago

OK, I got it. Thanks a lot!