Change the size of the region features

Hi, is it possible to change the region feature size from 2048 to 500 when finetuning the captioning model? If so, what should I change? Alternatively, is it OK to zero-pad the features up to 2048?

I would like to finetune the captioning model with a different object detection model, one that returns region features with fewer than 2048 dimensions.
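
To make the padding option concrete, here is a minimal sketch of what I mean. It assumes the region features for one image are a NumPy array of shape `(num_regions, 500)`; the function name `pad_region_features` and the 36-region example are hypothetical and not taken from the repository.

```python
import numpy as np


def pad_region_features(features, target_dim=2048):
    """Zero-pad the last axis of `features` up to `target_dim` (hypothetical helper)."""
    features = np.asarray(features)
    current_dim = features.shape[-1]
    if current_dim > target_dim:
        raise ValueError(
            f"feature size {current_dim} is larger than target size {target_dim}"
        )
    # Leave every axis untouched except the last, which is padded on the right.
    pad_width = [(0, 0)] * (features.ndim - 1) + [(0, target_dim - current_dim)]
    return np.pad(features, pad_width, mode="constant", constant_values=0)


# Assumed example: 36 regions, each with a 500-dimensional feature vector.
regions = np.random.rand(36, 500).astype(np.float32)
padded = pad_region_features(regions)
print(padded.shape)  # (36, 2048)
```

With this approach the first 500 columns keep the detector's values and the remaining 1548 columns are all zeros, so the model's input size would stay at 2048.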

Thanks!