broadinstitute / keras-rcnn

Keras package for region-based convolutional neural networks (RCNNs)

R-CNN API design #28

Closed JihongJu closed 7 years ago

JihongJu commented 7 years ago

I think it is time to think about what the R-CNN API should look like. We discussed it a bit in broadinstitute/keras-rcnn#7, but that discussion was more about the structure than the API.

From my understanding, an R-CNN framework should include a body that predicts ROIs from images, and a couple of heads that predict scores, bounding boxes (and masks).

The idea is to make it easy to attach different bodies (ResNets, VGG, FPN, etc.) and to choose whether to include or exclude the mask head.
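To make the body-plus-heads idea concrete, here is a rough, framework-agnostic sketch. Everything in it is hypothetical: `build_rcnn`, `toy_body`, and the head functions are plain Python stand-ins rather than keras-rcnn or Keras APIs, and in a real implementation each would presumably be a Keras layer or model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical stand-ins: plain callables instead of Keras layers/models,
# so that the sketch runs on its own.
Body = Callable[[list], list]  # image -> list of ROIs
Head = Callable[[list], list]  # ROIs -> one prediction per ROI


def toy_body(image):
    """Pretend body: proposes one ROI per row of the image."""
    return [("roi", i) for i, _ in enumerate(image)]


def score_head(rois):
    """Pretend classification head: a constant score per ROI."""
    return [0.5 for _ in rois]


def box_head(rois):
    """Pretend regression head: a constant box per ROI."""
    return [(0, 0, 1, 1) for _ in rois]


def mask_head(rois):
    """Pretend mask head: a 1x1 mask per ROI."""
    return [[[0]] for _ in rois]


@dataclass
class RCNN:
    body: Body
    heads: Dict[str, Head] = field(default_factory=dict)

    def predict(self, image):
        # The body runs once; every attached head consumes the same ROIs.
        rois = self.body(image)
        return {name: head(rois) for name, head in self.heads.items()}


def build_rcnn(body, include_mask=False):
    """Attach the default heads to any body; the mask head is optional."""
    heads = {"scores": score_head, "boxes": box_head}
    if include_mask:
        heads["masks"] = mask_head
    return RCNN(body=body, heads=heads)
```

Swapping the body would then just mean passing a different callable to `build_rcnn`, and `include_mask=True` would turn the same model into a Mask R-CNN style variant.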

Do you have an elegant way of doing this in mind? @0x00b1 @jhung0

0x00b1 commented 7 years ago

Elegant? No. 😆

My initial thinking was similar to your description. I imagined a few subclasses of Keras’s model class, e.g. an RPN model, an RCNN model, and a MaskRCNN model.
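As a very rough illustration of the subclass idea, and only as a sketch: `Model` below is a hypothetical stand-in for Keras's model class so that the snippet runs without Keras, and chaining the three classes by inheritance is an assumption made for the example, not something decided in this thread.

```python
class Model:
    """Hypothetical stand-in for the Keras model base class."""

    def output_names(self):
        return []


class RPN(Model):
    """Region proposal network: only proposes ROIs."""

    def output_names(self):
        return super().output_names() + ["rois"]


class RCNN(RPN):
    """Adds the classification and bounding-box outputs (assumed layout)."""

    def output_names(self):
        return super().output_names() + ["scores", "boxes"]


class MaskRCNN(RCNN):
    """Adds the mask output on top of the RCNN outputs (assumed layout)."""

    def output_names(self):
        return super().output_names() + ["masks"]
```

With this layout every new combination of components needs its own class, which is part of why a more modular design looks attractive.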

However, I think there’d be some benefit to something more modular. @mcquin has been thinking about this problem. I’d love to hear her thoughts.

I’ve also considered reaching out to @KaimingHe, @rbgirshick, @pdollar, etc. and asking for feedback. I’d especially like to hear feedback from @rbgirshick about his current thinking around RCNN implementation after writing and maintaining py-faster-rcnn for the past year or two.

@JihongJu Are you planning on attending CVPR? A few of us will be there (@jhung0 and I are attending), so maybe we can talk about future plans (I assume some of the aforementioned RCNN authors will be there too). SciPy is another option (@mcquin is attending). Alternatively, we could schedule a Skype call for anybody who’s interested.

When we (@jhung0, @mcquin, @AnneCarpenter, and I) first started discussing this, I wrote the following simple description in my notebook:

Keras-RCNN is a framework for solving image segmentation and object detection problems.

I still think this captures what I hope this becomes (rather than the application approach provided by py-faster-rcnn or yolo). I imagine us trying to stay state-of-the-art (inside the scope of RCNN) but enabling users to mix-and-match components (like a Mask-RCNN branch).

JihongJu commented 7 years ago

@0x00b1 Unfortunately, I am not attending CVPR this year. A Skype call would work best for me because I am currently based in Europe.

> I still think this captures what I hope this becomes (rather than the application approach provided by py-faster-rcnn or yolo). I imagine us trying to stay state-of-the-art (inside the scope of RCNN) but enabling users to mix-and-match components (like a Mask-RCNN branch).

I totally agree with the mix-and-match pattern for components. That is also what I have in mind.