fanq15 / FewX

FewX is an open-source toolbox on top of Detectron2 for data-limited instance-level recognition tasks.
https://github.com/fanq15/FewX
MIT License
346 stars, 48 forks

Dataset Generation #2

Closed: XiongweiWu closed this issue 4 years ago

XiongweiWu commented 4 years ago

Three questions:

  1. In 1_split_filter.py#L46-L48, my understanding is that a sampled image should not contain any objects from the VOC classes. However, the implementation seems to exclude only images with tiny objects.

  2. In 2_balance.py#L57, does each category contain no more than 80 instances?

  3. How is final_split_voc_10_shot_instances_train2017.json generated?

fanq15 commented 4 years ago
  1. Those lines pick the non-VOC classes.
  2. 80 is the minimum number of instances in each class.
  3. You can use the provided final_split_voc_10_shot_instances_train2017.json in the new_annotations directory for a fair comparison.
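
For readers following along, the idea of picking only the non-VOC classes from a COCO-style annotation file can be sketched roughly as below. This is an illustrative sketch, not the code in 1_split_filter.py: the function name `drop_voc_annotations` and the constant `VOC_NAMES_IN_COCO` are hypothetical, and the tiny-object filtering mentioned in the question is left out. As confirmed later in the thread, an image that also shows VOC objects is kept, and those objects are simply left unlabeled.

```python
# The 20 PASCAL VOC categories, written with their COCO category names.
VOC_NAMES_IN_COCO = {
    "airplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "dining table", "dog", "horse", "motorcycle", "person",
    "potted plant", "sheep", "couch", "train", "tv",
}


def drop_voc_annotations(coco):
    """Return a COCO-style dict that keeps only non-VOC annotations.

    Hypothetical helper: images that contain VOC objects are NOT dropped;
    only their VOC annotations are removed. Images left with no annotation
    at all are dropped.
    """
    voc_ids = {c["id"] for c in coco["categories"]
               if c["name"] in VOC_NAMES_IN_COCO}
    kept_anns = [a for a in coco["annotations"]
                 if a["category_id"] not in voc_ids]
    kept_image_ids = {a["image_id"] for a in kept_anns}
    return {
        "images": [im for im in coco["images"] if im["id"] in kept_image_ids],
        "annotations": kept_anns,
        "categories": [c for c in coco["categories"]
                       if c["id"] not in voc_ids],
    }
```

The input is assumed to be the dict obtained by calling `json.load` on a COCO instances file, with the usual `images`, `annotations`, and `categories` keys.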
XiongweiWu commented 4 years ago

@fanq15

  1. So in your non-VOC set, images may also contain VOC-class instances that are simply not labeled?

  2. It seems that you first compute the total number of instances per class across all images and store it in 'all_cls_dict'. Then, for each image, if any category it contains has fewer than 80 instances in 'all_cls_dict', all instances in that image are saved for training; otherwise all of its instances are discarded, and instances of classes whose count is larger than 80 are removed. I am a bit confused by this file.

  3. Could you provide a 30-shot JSON file?

fanq15 commented 4 years ago
  1. Yes. The VOC instances are ignored.
  2. About 2_balance.py:
     2.1. Yes, it should be the instance number per class. I fixed the wording in my earlier answer.
     2.2. There is a bug in 2_balance.py: it does not actually balance the categories. However, this bug does not affect training or evaluation. I will fix it and see whether balancing the images improves performance.
  3. There is currently no 30-shot JSON file. I will add it later.
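
To check for yourself how balanced a given annotation file is, the per-class instance counts can be computed along these lines. This is a minimal sketch and not the logic of 2_balance.py: the helper names `instances_per_category` and `categories_below` are hypothetical, and the default threshold of 80 is only borrowed from the number discussed above.

```python
from collections import Counter


def instances_per_category(coco):
    """Count annotated instances per category name in a COCO-style dict."""
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    return Counter(id_to_name[a["category_id"]] for a in coco["annotations"])


def categories_below(coco, threshold=80):
    """List category names that have fewer than `threshold` instances.

    Categories with no annotation at all count as zero instances.
    """
    counts = instances_per_category(coco)
    return sorted(c["name"] for c in coco["categories"]
                  if counts[c["name"]] < threshold)
```

With a file loaded through `json.load`, `instances_per_category(data).most_common()` lists the classes from most to least frequent, which makes it easy to see whether a split is balanced.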