ammirato / target_driven_instance_detection

MIT License

few-shot detection #4

Closed: asdfqwer2015 closed this issue 5 years ago

asdfqwer2015 commented 5 years ago

Hi, your model is very interesting, and thanks for sharing the code. Why not add the few-shot instance detection implementation described in the paper to the project? That might be more useful in practice. Also, can the model be used at test time on instances of ImageNet classes? Thanks.

asdfqwer2015 commented 5 years ago

Hmm, to generalize to other classes, such as the VOC classes, would it work to construct a dataset from a tracking dataset, e.g. TrackingNet? For training, choose the instance in the first frame as the 'target' (the instance is likely to be clearly visible in the first frame) and sample one frame out of every 20~30 frames as the 'scene'. The constructed dataset would have more classes and more instances per class. Is this feasible? Thanks again.
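The sampling scheme proposed above can be sketched roughly as follows. This is only an illustration of the idea in the comment, not code from the repository: the function name `sample_target_scene_pairs`, its parameters, and the use of frame indices instead of real TrackingNet annotations are all assumptions.

```python
import random


def sample_target_scene_pairs(num_frames, min_gap=20, max_gap=30, seed=0):
    """Pair frame 0 (the 'target') with one 'scene' frame every 20~30 frames.

    Hypothetical helper: it works on frame indices only and knows nothing
    about how TrackingNet stores its videos or bounding boxes.
    """
    rng = random.Random(seed)
    pairs = []
    # The first scene frame is min_gap..max_gap frames after the target.
    scene_idx = rng.randint(min_gap, max_gap)
    while scene_idx < num_frames:
        pairs.append((0, scene_idx))
        # Jump ahead by another random gap in the same range.
        scene_idx += rng.randint(min_gap, max_gap)
    return pairs


pairs = sample_target_scene_pairs(num_frames=200)
```

Each returned pair is `(target_frame_index, scene_frame_index)`; cropping the target box from frame 0 and loading the scene image and its box would have to be added on top of this.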

ammirato commented 5 years ago

Hi, thanks for your comments. We may be adding the few-shot data and code in the future.

For the tracking scenario, this may be possible. I'd encourage you to look at some similar work to ours in tracking, DaSiamRPN. They do not release training code, but reading the paper may give some insights. It seems similar to what you are describing.

asdfqwer2015 commented 5 years ago

Thanks for your reply and for recommending the related paper. Yes, I mean a generated dataset similar to DaSiamRPN's. Do you think that, if trained on a dataset like this, the model could detect an instance of a class that is not in AVD, e.g. a bicycle or an animal?

Hope to see the few-shot data and code. :)

ammirato commented 5 years ago

The DaSiamRPN method is very similar to ours, so I would expect them to work similarly. The main difference is in the size of the target image. The tracking SiamRPNs have a strong prior that the size of the object does not change much from frame to frame. For detection, TDID pools the target to 1x1, so there is no knowledge of what scale the target object is in the scene image. I would expect the tracking methods to work better on tracking and TDID to work better on detection.
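To make the "pools the target to 1x1" point concrete, here is a minimal pure-Python sketch. It assumes a global max pool over the target feature map followed by a channel-wise product with the scene features; the comment above only states that the target is pooled to 1x1, so the pooling type, the product, and the helper names `global_max_pool` and `correlate_1x1` are illustrative assumptions, not the repository's code.

```python
def global_max_pool(feature_map):
    """Collapse a C x H x W feature map (nested lists) to a length-C vector.

    After this step the target's spatial size, and therefore any hint about
    its scale, is gone. (Max pooling is an assumption for this sketch.)
    """
    return [max(max(row) for row in channel) for channel in feature_map]


def correlate_1x1(scene_features, target_vector):
    """Multiply every scene location by the pooled target value, per channel."""
    return [
        [[value * t for value in row] for row in channel]
        for channel, t in zip(scene_features, target_vector)
    ]


# Two targets of different spatial size that pool to the same 1x1 descriptor.
small_target = [[[1, 2], [3, 4]], [[0, -1], [5, 2]]]
large_target = [[[1, 2, 3], [4, 0, 1], [2, 2, 2]],
                [[5, 0, 0], [0, 0, 0], [1, 1, 1]]]
scene = [[[1, 0, 2]], [[1, 1, 0]]]  # 2 channels, 1 x 3 spatial grid

pooled_small = global_max_pool(small_target)
pooled_large = global_max_pool(large_target)
response = correlate_1x1(scene, pooled_small)
```

Both targets reduce to the same per-channel vector, which is why the detector cannot rely on target size as a scale prior the way a tracker can.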

asdfqwer2015 commented 5 years ago

I see, the same datasets with different tasks. I hope to see the expected results.

asdfqwer2015 commented 5 years ago

If I want to train the model with TrackingNet, should I (a) train on TrackingNet and then fine-tune on AVD, or (b) merge TrackingNet and AVD and then train on the merged dataset? Thanks. @ammirato