Hi,
I was implementing some models to do VQA and found your repo really useful. However, it seems like I can only get the FRCN_FEAT and BBOX_FEAT as image inputs to the model. Is there any way to take the original images as inputs and not the extracted features?