shilrley6 / Faster-R-CNN-with-model-pretrained-on-Visual-Genome

Faster R-CNN model in PyTorch, pretrained on Visual Genome with ResNet-101
231 stars 59 forks

Extract model weights for other tasks #2

Open nilinykh opened 4 years ago

nilinykh commented 4 years ago

Hello!

Great work! Was this model trained for classification? I am not sure, but if it was trained for a specific task, it should contain task-specific layers (linear layers, pooling layers) that can be removed if I want to apply the model to another task.

So, could you please provide some information about using this model's weights for another task? I would like to use it for image captioning, so it would be great to load only its weights, without any task-specific layers.

zhuocai commented 4 years ago

This model is trained for object detection. For tasks like image captioning and visual question answering, you can use the output of generate_tsv.py directly as image features. Thanks to the author, you don't need to train a Faster R-CNN yourself to get image features for these two downstream tasks.
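
For reference, here is a minimal sketch of reading such a feature file for a downstream task. It assumes the output follows the TSV convention used by the original bottom-up-attention code: one row per image, with the columns image_id, image_w, image_h, num_boxes, boxes and features, where the last two are base64-encoded float32 buffers. That layout is an assumption and is not confirmed in this thread, so check the field names in generate_tsv.py before relying on it. The helpers `decode_floats`, `encode_floats` and `read_features` are hypothetical names used only for illustration.

```python
import base64
import csv
import io
from array import array

# Assumed column layout (bottom-up-attention convention); verify against generate_tsv.py.
FIELDNAMES = ["image_id", "image_w", "image_h", "num_boxes", "boxes", "features"]


def decode_floats(b64_text):
    """Decode a base64 string into a flat list of float32 values."""
    values = array("f")
    values.frombytes(base64.b64decode(b64_text))
    return values.tolist()


def encode_floats(values):
    """Inverse of decode_floats; used here only to build a synthetic row."""
    return base64.b64encode(array("f", values).tobytes()).decode("ascii")


def read_features(tsv_file):
    """Yield one dict per image with its boxes and per-box feature vectors."""
    # Feature columns are far longer than the csv module's default field limit.
    csv.field_size_limit(2**31 - 1)
    reader = csv.DictReader(tsv_file, delimiter="\t", fieldnames=FIELDNAMES)
    for row in reader:
        n = int(row["num_boxes"])
        boxes = decode_floats(row["boxes"])
        feats = decode_floats(row["features"])
        dim = len(feats) // n if n else 0  # feature size inferred from the buffer
        yield {
            "image_id": row["image_id"],
            "boxes": [boxes[i * 4:(i + 1) * 4] for i in range(n)],
            "features": [feats[i * dim:(i + 1) * dim] for i in range(n)],
        }


# Synthetic example row: 2 boxes, 3-dimensional features (real features are much larger).
sample_row = "\t".join([
    "42", "640", "480", "2",
    encode_floats([0, 0, 10, 10, 5, 5, 20, 20]),
    encode_floats([0.5, 1.0, 1.5, 2.0, 2.5, 3.0]),
])
records = list(read_features(io.StringIO(sample_row + "\n")))
```

With a real file you would pass an open file handle instead of the `io.StringIO` buffer. In practice `numpy.frombuffer(..., dtype=numpy.float32)` followed by a reshape to `(num_boxes, -1)` is the more common way to decode the buffers; the standard-library version above just keeps the sketch free of dependencies.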