Tetsujinfr opened 7 years ago
Is the pre-trained model trained on the IIT-AFF dataset you quote in your paper, i.e. with the following classes?
If so, are you planning on training it on another dataset, for instance VOC2012 or something with people in it? I do not have 11GB available to train the model on a dataset :(
thank you
Tets
Currently, we use threshold=0.9. If no box scores above 0.9, we choose the highest one, no matter what its confidence is. You can lower the param CONF_THRESHOLD = 0.9 if you want to see more objects with confidence < 0.9 (it really depends on the scene).
You may also want to change or disable the part of the code that chooses the highest-confidence box (lines 154-159 in demo_img.py).
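To make the selection rule above concrete, here is a minimal sketch in Python. It is not the code from demo_img.py; the function name `select_detections` and the `fallback_to_best` flag are hypothetical and only illustrate "keep everything above the threshold, otherwise keep the single best box".

```python
CONF_THRESHOLD = 0.9  # same name as the param mentioned above


def select_detections(scores, conf_threshold=CONF_THRESHOLD, fallback_to_best=True):
    """Return the indices of the detections to keep.

    scores: list of confidence values, one per detected box.
    fallback_to_best: hypothetical switch; set it to False to mimic
    disabling the "choose the highest confidence box" part.
    """
    # Keep every box whose confidence is strictly above the threshold.
    keep = [i for i, s in enumerate(scores) if s > conf_threshold]

    # If nothing passed, fall back to the single highest-scoring box,
    # regardless of how low its confidence is.
    if not keep and fallback_to_best and scores:
        best = max(range(len(scores)), key=lambda i: scores[i])
        keep = [best]
    return keep


if __name__ == "__main__":
    print(select_detections([0.95, 0.50, 0.91]))
    print(select_detections([0.30, 0.60, 0.20]))
    print(select_detections([0.30, 0.60, 0.20], fallback_to_best=False))
```

With the fallback enabled you always get at least one box per image, which is why a low-confidence detection can still be drawn.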
Yes, we train AffordanceNet on the IIT-AFF dataset, and the object classes are as you posted. We can configure AffordanceNet to train on Pascal VOC, but please note the mask in Pascal VOC is binary (i.e. background or foreground), so you will not see the full power of AffordanceNet. We designed AffordanceNet to handle multiple classes within each object, not only binary masks.
I'll release a smaller version of the net soon, so you can train on any dataset (that has any objects) you want.
@nqanh Thanks for sharing! Very nice work, but how do I make a .sm file? Could you also share an example so I can make my own data?
Thanks for your interest @felixfuu! I just added the utils folder. You can find the script to create .sm files and all relevant information to train AffordanceNet on your own data.
@nqanh Thanks for your reply! Another question: how can I ensure that a very small mask is detected?
The affordance mask depends on the size of the object, and the object size depends on the anchor parameters (scale and ratio) of the object detector. The concept of anchors was proposed in the Faster R-CNN paper; here we use 15 anchors as in the Mask R-CNN paper.
If you want to detect very small objects (e.g., 5x5), you should change the anchor-related params in the prototxt file. You can play with Faster R-CNN first before doing the same for AffordanceNet, because if the object detector fails, then the mask branch will fail.
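As a rough illustration of how scale and ratio determine anchor size, here is a small Python sketch. The base size, scales and ratios below are assumptions chosen only so that 5 scales x 3 ratios give 15 anchors; they are not read from the AffordanceNet prototxt, and `anchor_shapes` is a hypothetical helper.

```python
import math


def anchor_shapes(base_size, scales, ratios):
    """Return a list of (width, height) anchor shapes.

    Each anchor has area (base_size * scale) ** 2 and aspect
    ratio height / width == ratio.
    """
    shapes = []
    for scale in scales:
        area = float(base_size * scale) ** 2
        for ratio in ratios:
            width = math.sqrt(area / ratio)
            height = width * ratio
            shapes.append((width, height))
    return shapes


if __name__ == "__main__":
    # Assumed values for illustration only.
    shapes = anchor_shapes(base_size=16,
                           scales=(2, 4, 8, 16, 32),
                           ratios=(0.5, 1.0, 2.0))
    print(len(shapes), "anchors")
    smallest = min(shapes, key=lambda wh: wh[0] * wh[1])
    print("smallest anchor (w, h):", smallest)
```

If the smallest anchor is much larger than the objects you care about, the detector is unlikely to propose boxes for them, so smaller scales would be the first thing to try.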
@nqanh OK, thanks! Mask R-CNN uses FPN as the feature extraction model, but in this project I can't find the implementation of FPN. Is there any information about FPN?
No, we use VGG16 backbone to extract features and mainly focus on the mask branch (for multiclass affordances). The object detection branch is quite simple (only 2 fully connected layers are used). You can extend AffordanceNet with ResNet and FPN.
Hi, I have some low-confidence messages popping up (message and related image below), and some classes go undetected in many images, e.g. people.
What classes did you train the provided pre-trained model with please?
Tets