Hi, is it possible to change the region feature size from 2048 to 500 when fine-tuning the captioning model? If so, what should I change?
Alternatively, is it OK to zero-pad the features up to 2048?
I would like to fine-tune the captioning model with a different object detection model, one that returns region features with fewer than 2048 dimensions.
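To make the padding option concrete, this is roughly what I have in mind. It is only a minimal NumPy sketch: `pad_region_features` and the 36-region input are placeholders of mine, not anything taken from the captioning code.

```python
import numpy as np


def pad_region_features(features: np.ndarray, target_dim: int = 2048) -> np.ndarray:
    """Zero-pad (num_regions, feat_dim) features along the feature axis to target_dim."""
    num_regions, feat_dim = features.shape
    if feat_dim > target_dim:
        raise ValueError(f"feature size {feat_dim} exceeds target size {target_dim}")
    # Pad only on the right of the feature axis; the region axis is left untouched.
    return np.pad(features, ((0, 0), (0, target_dim - feat_dim)), mode="constant")


# Hypothetical detector output: 36 regions with 500-dimensional features.
regions = np.random.rand(36, 500).astype(np.float32)
padded = pad_region_features(regions)
print(padded.shape)
```

The first 500 columns would keep the detector's values and the remaining 1548 would be zeros, so the checkpoint's 2048-wide input layer could stay as it is.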
Thanks!