SamsungLabs / fbrs_interactive_segmentation

[CVPR2020] f-BRS: Rethinking Backpropagating Refinement for Interactive Segmentation https://arxiv.org/abs/2001.10331
Mozilla Public License 2.0
583 stars 94 forks

multi object segmentation #28

Closed shiragit closed 3 years ago

shiragit commented 3 years ago

Hi, does this model support multi-object segmentation in a single frame? Thanks.

chamecall commented 3 years ago

++

chamecall commented 3 years ago

> Hi, does this model support multi-object segmentation in a single frame? Thanks.

@shiragit, have you found a solution to the problem of segmenting multiple objects?

ptrvilya commented 3 years ago

Hi! The model supports segmentation of multiple objects, though it's an unnatural use case. The model was trained on masks of at most two merged objects (e.g. these lines), so selecting each object separately should work better. Multi-object segmentation on a single image is implemented in our demo: the "finish object" button adds the current mask to the instance masks for the current image and resets the model so you can start a new prediction from scratch.
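
A rough sketch of that "finish object" flow, for illustration only. This is not the demo's actual code: `MultiObjectSession`, `update_current` and `finish_object` are hypothetical names, masks are plain nested lists, and the network prediction is replaced by a hand-written mask.

```python
# Hypothetical sketch of the "finish object" flow; none of these
# names are taken from the repository.


class MultiObjectSession:
    """Collects one binary mask per object into a single instance map."""

    def __init__(self, height, width):
        # 0 = background, k >= 1 = k-th finished object
        self.instance_map = [[0] * width for _ in range(height)]
        self.num_objects = 0
        self.current_mask = None  # binary mask of the object being edited

    def update_current(self, mask):
        # In a real tool this would be the network's prediction
        # for the clicks placed on the current object.
        self.current_mask = mask

    def finish_object(self):
        """Merge the current mask into the instance map, then reset."""
        if self.current_mask is None:
            return
        self.num_objects += 1
        for y, row in enumerate(self.current_mask):
            for x, value in enumerate(row):
                if value:
                    self.instance_map[y][x] = self.num_objects
        # Stands in for clearing the clicks and resetting the model state.
        self.current_mask = None


session = MultiObjectSession(height=2, width=3)
session.update_current([[1, 0, 0], [1, 0, 0]])  # first object
session.finish_object()
session.update_current([[0, 0, 1], [0, 0, 0]])  # second object
session.finish_object()
print(session.instance_map)  # [[1, 0, 2], [1, 0, 0]]
```

Each object still gets its own prediction; the session only accumulates the results, and where masks overlap the later object overwrites the earlier one.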

chamecall commented 3 years ago

@ptrvilya, do you mean it's possible to get segmentations for multiple objects in one forward pass, rather than sequentially for each object? As far as I can see, your demo doesn't pass the points of previous objects (those selected before clicking "finish object"), only the points of the current one, so the network has to be re-run for each object.

chamecall commented 3 years ago

@ptrvilya, are there any ways to improve performance compared with sequentially passing the points of each object to the network?

ptrvilya commented 3 years ago

@chamecall

> do you mean it's possible to get segmentations for multiple objects in one forward pass, rather than sequentially for each object? As far as I can see, your demo doesn't pass the points of previous objects (those selected before clicking "finish object"), only the points of the current one, so the network has to be re-run for each object.

No, if you want to predict separate masks for different objects you have to re-run the network.

> are there any ways to improve performance compared with sequentially passing the points of each object to the network?

Theoretically, you can combine clicks for different objects and perform a batched forward pass, but in that case all crops must have exactly the same size, so the Zoom-In implementation would have to be changed. If you need to interact with multiple objects simultaneously, you can also implement some sort of context switching, i.e. store the runtime parameters for each object and switch between them when the user selects a different object.
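
The context-switching idea could look roughly like the sketch below. It is a minimal illustration under assumptions, not code from the repository: `ObjectContext`, `ContextSwitcher` and the field names are hypothetical, and `runtime_params` is a placeholder for whatever per-object state a real predictor would need to save and restore.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectContext:
    """Per-object state to save and restore (field names are illustrative)."""
    clicks: list = field(default_factory=list)           # (x, y, is_positive)
    runtime_params: dict = field(default_factory=dict)   # placeholder state
    mask: object = None                                  # last predicted mask


class ContextSwitcher:
    """Keeps one ObjectContext per object and tracks the active one."""

    def __init__(self):
        self.contexts = {}
        self.active_id = None

    def switch_to(self, object_id):
        # A context is created the first time an object is selected
        # and reused, with its stored state, on later visits.
        self.active_id = object_id
        return self.contexts.setdefault(object_id, ObjectContext())

    @property
    def active(self):
        return self.contexts[self.active_id]

    def add_click(self, x, y, is_positive=True):
        self.active.clicks.append((x, y, is_positive))


switcher = ContextSwitcher()
switcher.switch_to("cat")
switcher.add_click(10, 20)
switcher.switch_to("dog")
switcher.add_click(50, 60, is_positive=False)
switcher.switch_to("cat")       # earlier clicks for "cat" are still there
switcher.add_click(12, 22)
print(switcher.active.clicks)   # [(10, 20, True), (12, 22, True)]
```

This only avoids losing per-object state when the user jumps between objects; every object still needs its own forward pass.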