YoungXIAO13 / FewShotDetection

(ECCV 2020) PyTorch implementation of paper "Few-Shot Object Detection and Viewpoint Estimation for Objects in the Wild"
http://imagine.enpc.fr/~xiaoy/FSDetView/
MIT License

Some questions about "Class Data" ? #9

Closed HuangLian126 closed 3 years ago

HuangLian126 commented 3 years ago

Thanks for your work. I have a question: in each iteration, do you sample 15 different objects for the 15 base categories as the support set data? In other words, should we sample 60 different objects for the 60 base categories as the support set data? I do not have a GPU with enough memory. Can you give me some suggestions?

YoungXIAO13 commented 3 years ago

In each iteration we select one class data sample for each training category. So in the base-training stage there are 15 different objects for PASCAL and 60 for COCO, and in the few-shot fine-tuning stage there are 20 for PASCAL and 80 for COCO.

GPU memory is indeed an issue when training a big network on large data. One workaround is to reduce the batch size and rescale the learning rate accordingly (cf here).
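As a rough illustration of rescaling the learning rate with the batch size, here is a minimal sketch. It assumes the common linear scaling rule (learning rate proportional to batch size), which may or may not be what the linked reference recommends. The function name `rescale_lr` and the numbers are hypothetical, not values from this repository.

```python
def rescale_lr(base_lr: float, base_batch_size: int, new_batch_size: int) -> float:
    """Scale the learning rate linearly with the batch size.

    Assumption: the linear scaling rule, i.e. lr / batch_size stays constant.
    """
    if base_batch_size <= 0 or new_batch_size <= 0:
        raise ValueError("batch sizes must be positive")
    return base_lr * new_batch_size / base_batch_size


# Hypothetical example values, not the repository's defaults.
base_lr = 0.001        # learning rate tuned for the original batch size
base_batch_size = 4    # original batch size
new_batch_size = 1     # smaller batch size that fits in GPU memory

new_lr = rescale_lr(base_lr, base_batch_size, new_batch_size)
print(f"batch size {new_batch_size}: learning rate {new_lr}")
```

With a smaller batch each step sees fewer samples, so it is usually also necessary to train for proportionally more iterations to cover the same amount of data.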

HuangLian126 commented 3 years ago

Figure 3 of the paper "Meta R-CNN: Towards General Solver for Instance-level Low-shot Learning" states: "The illustrative instance of RoI meta-learning process in Meta R-CNN. Suppose the image Faster/Mask R-CNN receiving contains objects in “person”, “horse”. Then Cmeta = (“person”, “horse”) and Dmeta includes K-shot “person” and “horse” images with their structure labels, respectively. As the training image iteratively changes, Cmeta and Dmeta would adaptively change." That is, in the base-training stage, the sampled objects of the support set change adaptively based on the objects in the query set.

However, why does their source code not correspond to their paper? Or am I getting it wrong?

YoungXIAO13 commented 3 years ago

Actually this is just an illustration. The support data does not depend on the query data, and it should not. All the support data is generated before training and is iterated over repeatedly by a meta_data_loader, whose order is independent of the query_data_loader.
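To make the independence between the two loaders concrete, here is a minimal pure-Python sketch. It is not the repository's actual meta_data_loader: the helper names `make_support_batches` and `train_steps` are hypothetical, strings stand in for images, and restarting the support data with `itertools.cycle` is an assumption about how the repeated iteration works.

```python
import itertools


def make_support_batches(class_names, num_batches):
    """Pre-generate support data before training: one entry per class in each batch."""
    return [
        {name: f"{name}_support_{i}" for name in class_names}
        for i in range(num_batches)
    ]


def train_steps(query_batches, support_batches, num_epochs):
    """Pair each query batch with the next support batch.

    The support iterator has its own order and restarts when exhausted,
    so it never looks at which classes appear in the query batch.
    """
    support_iter = itertools.cycle(support_batches)
    steps = []
    for _ in range(num_epochs):
        for query in query_batches:
            support = next(support_iter)  # independent of the query content
            steps.append((query, support))
    return steps


classes = ["person", "horse", "dog"]          # stand-in for the training categories
support_batches = make_support_batches(classes, num_batches=2)
query_batches = ["query_0", "query_1", "query_2"]

steps = train_steps(query_batches, support_batches, num_epochs=1)
for query, support in steps:
    print(query, sorted(support))
```

Every step receives one support entry for every training class, whatever the query image contains, which matches the 15/60 (base training) and 20/80 (fine-tuning) counts mentioned earlier in the thread.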

HuangLian126 commented 3 years ago

Thanks for your reply. I can only run the meta_data_loader code, since I cannot build roi_align/roi_crop/roi_pooling. I do not want to change my CUDA version (10.0), so I am trying to reproduce your model on top of MMDetection. It is very laborious because of the framework's abstract programming style. It would be much better if your code supported CUDA 10.0.