SamsungLabs / ritm_interactive_segmentation

Reviving Iterative Training with Mask Guidance for Interactive Segmentation
MIT License
626 stars, 125 forks

Multi object segmentation in one network pass #4

Open chamecall opened 3 years ago

chamecall commented 3 years ago

Hello. Thanks for your work. Could the project be adapted for multi-object selection? By multi-object selection I mean the case where clicks are not just foreground and background but are assigned to object 1, object 2, and so on, and the result is a segmentation for each object.

The most important factor is segmenting N objects in one network pass, because I can have a large number of objects, and a sequential pass for every object in the image may result in a long delay.

If multi-object segmentation is not possible in your current implementation, then: 1) is it possible to train your model for that purpose, or can the architecture not handle this case? 2) do you perhaps know of a network or project that solves my use case? If so, could you share the info? Thanks again.

ksofiyuk commented 3 years ago

Hello. Yes, it is possible to adapt the code to enable one-pass multi-object selection. But it is not very simple and requires rewriting some of the code.

You should modify the model so that it takes BS x N x P x 2 points (N objects; P is the maximum number of positive/negative points) or BS x N x 2 disk maps. Currently it takes BS x P x 2 points or BS x 2 disk maps for each sample in a batch. To train this model, you need to choose N objects for each sample in a batch and either replace the NFL loss function with Softmax+CE or use N separate NFL losses, one per object (one-vs.-rest strategy).
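
For readers who want to see the shapes concretely, here is a minimal NumPy sketch of the two pieces described above: rendering per-object disk maps from a BS x N x P x 2 point tensor, and the two loss options. It is not code from this repository. The function names (`render_disk_maps`, `softmax_ce_loss`, `one_vs_rest_bce_loss`), the `-1` padding for unused click slots, the "first half of the P slots is positive, second half is negative" layout, the extra background channel in the softmax variant, and the use of plain binary cross-entropy as a stand-in for NFL are all assumptions made for illustration.

```python
import numpy as np


def render_disk_maps(points, height, width, radius=5):
    """Turn clicks into binary disk maps, one positive/negative pair per object.

    points: int array (BS, N, P, 2) of (row, col) clicks; unused slots are -1.
            Assumption: slots [0, P//2) are positive, [P//2, P) are negative.
    Returns a float32 array of shape (BS, N, 2, H, W).
    """
    bs, n, p, _ = points.shape
    rows = np.arange(height).reshape(1, 1, 1, height, 1)
    cols = np.arange(width).reshape(1, 1, 1, 1, width)
    pr = points[..., 0].reshape(bs, n, p, 1, 1)
    pc = points[..., 1].reshape(bs, n, p, 1, 1)
    # (BS, N, P, H, W): True where a pixel lies within `radius` of a real click.
    inside = ((rows - pr) ** 2 + (cols - pc) ** 2 <= radius ** 2) & (pr >= 0)
    half = p // 2
    pos = inside[:, :, :half].any(axis=2)
    neg = inside[:, :, half:].any(axis=2)
    return np.stack([pos, neg], axis=2).astype(np.float32)


def softmax_ce_loss(logits, labels):
    """Softmax + cross-entropy over objects.

    logits: (BS, N + 1, H, W); channel 0 is background (an assumption).
    labels: int array (BS, H, W) with values in [0, N].
    """
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    picked = np.take_along_axis(log_probs, labels[:, None], axis=1)
    return float(-picked.mean())


def one_vs_rest_bce_loss(logits, labels):
    """One binary loss per object (one-vs.-rest), averaged over objects.

    logits: (BS, N, H, W), one binary logit map per object.
    labels: int array (BS, H, W); 0 is background, k >= 1 is object k.
    Plain binary cross-entropy is used here as a stand-in for NFL.
    """
    n = logits.shape[1]
    object_ids = np.arange(1, n + 1).reshape(1, n, 1, 1)
    targets = (labels[:, None] == object_ids).astype(np.float64)
    # Stable BCE-with-logits: max(x, 0) - x * t + log(1 + exp(-|x|))
    per_pixel = (np.maximum(logits, 0) - logits * targets
                 + np.log1p(np.exp(-np.abs(logits))))
    return float(per_pixel.mean())


if __name__ == "__main__":
    # BS=1, N=2 objects, P=2 slots per object (1 positive, 1 negative).
    points = np.full((1, 2, 2, 2), -1, dtype=np.int64)
    points[0, 0, 0] = (8, 8)    # positive click on object 1
    points[0, 1, 0] = (20, 24)  # positive click on object 2
    maps = render_disk_maps(points, height=32, width=32)
    print(maps.shape)  # (BS, N, 2, H, W)
```

In this layout the N objects of one image could be fed to the network together, for example by flattening the N x 2 map channels into the input, so that a single forward pass yields N (or N + 1) output channels. How the extra channels are wired into the backbone is a design decision the sketch leaves open.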