facebookresearch / maskrcnn-benchmark

Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.
MIT License

How to deal with empty image in my own datasets #144

Open wangjp0408 opened 5 years ago

wangjp0408 commented 5 years ago

❓ Questions and Help

I train on my own dataset in a "sliding window" fashion, because the images are too large and contain lots of objects (about 1000 per image). This creates a problem: when the window is at a position where there is nothing in it (not a single box), I still put the "sub-region image in the current window" into the network, but how should I handle the boxes, labels and masks? Would it work if I just set a box to [0,0,0,0], add a label field of 0, and a mask of ......[0,0]? ..... Thanks for your help!
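
For reference, here is a rough sketch of what cropping one window and filtering its boxes could look like. This is purely illustrative; `crop_window`, `boxes` and `window` are names I am making up here, not anything from the repo.

import torch

def crop_window(image, boxes, window):
    # image: C x H x W tensor; boxes: N x 4 float tensor of [x1, y1, x2, y2];
    # window: (x1, y1, x2, y2) pixel coordinates of the current sliding window.
    wx1, wy1, wx2, wy2 = window
    tile = image[:, wy1:wy2, wx1:wx2]

    # Shift boxes into window coordinates and clip them to the tile.
    shifted = boxes - torch.tensor([wx1, wy1, wx1, wy1], dtype=boxes.dtype)
    shifted[:, 0::2] = shifted[:, 0::2].clamp(0, wx2 - wx1)
    shifted[:, 1::2] = shifted[:, 1::2].clamp(0, wy2 - wy1)

    # Keep only boxes that still have positive area; a tile may end up with none,
    # which is exactly the empty-image case asked about above.
    keep = (shifted[:, 2] > shifted[:, 0]) & (shifted[:, 3] > shifted[:, 1])
    return tile, shifted[keep]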

fmassa commented 5 years ago

Hi,

This is a great question!

The way I currently do it is to remove all empty images in the initialization of the dataset. If this is doable for you, that would be great!

If this isn't possible, you could try hacking around in here to continue the training if any of the target BoxLists is empty.

Something like

if any(len(t) == 0 for t in targets):
    continue

but this will bump the iteration counter, which is not ideal but is a simple enough workaround. Does this work for you?
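
For context, a minimal sketch of where such a check could sit in the training loop. The loop below is only an approximation of what the trainer in the repo does; model, optimizer, data_loader, device and start_iter are assumed to be set up as usual.

for iteration, (images, targets, _) in enumerate(data_loader, start_iter):
    # Skip batches in which any image carries an empty BoxList; note that
    # `iteration` still advances, so the schedule is slightly shortened.
    if any(len(t) == 0 for t in targets):
        continue

    images = images.to(device)
    targets = [t.to(device) for t in targets]

    loss_dict = model(images, targets)
    losses = sum(loss for loss in loss_dict.values())

    optimizer.zero_grad()
    losses.backward()
    optimizer.step()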

SupperSu commented 5 years ago

I am currently having the same problem. My dataset has two labels, disease or no disease, and if there is no disease there are no bounding boxes in the picture. Right now I am simply setting the box coordinates to all zeros (x1 = 0, y1 = 0, x2 = 0, y2 = 0) with a label of 0; training runs normally, but I am not sure how it influences performance.

fmassa commented 5 years ago

You might be degrading the performance of your model because you are assigning a positive box to something completely random.

We should define what we want to do in those cases: do we want to only sample the negative examples (and thus not have the positives in the batch)? This might bias the learning for those images, which might not be good.

I'd say the better thing to do is really to remove those images in the constructor of your dataset, for example as done here in the COCODataset.
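
A rough sketch of that idea in a dataset constructor, along the lines of what the COCODataset does; the class and argument names here are illustrative, not copied from the repo.

from pycocotools.coco import COCO

class MyDataset:
    def __init__(self, ann_file, remove_empty=True):
        self.coco = COCO(ann_file)
        self.ids = sorted(self.coco.imgs.keys())
        if remove_empty:
            # Drop image ids that have no annotations at all, so that every
            # sample seen during training contains at least one box.
            self.ids = [
                img_id for img_id in self.ids
                if len(self.coco.getAnnIds(imgIds=img_id)) > 0
            ]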

wangjp0408 commented 5 years ago

Thanks for your reply! The first way you mentioned above works well for me, and it's easy to do. Thanks a lot :)

SupperSu commented 5 years ago

Thank you for your response. I found that some people use the mask to predict the bounding box: they generate a mask filled with zeros and then mark every pixel that falls inside a bounding box as 1.
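
That box-to-mask step could look roughly like the sketch below; `box_to_mask` is a hypothetical helper, not an existing function in the repo.

import torch

def box_to_mask(box, height, width):
    # Hypothetical helper: rasterize a single [x1, y1, x2, y2] box into a
    # binary mask of shape (height, width), marking pixels inside the box as 1.
    x1, y1, x2, y2 = [int(round(float(v))) for v in box]
    mask = torch.zeros((height, width), dtype=torch.uint8)
    mask[y1:y2 + 1, x1:x2 + 1] = 1
    return mask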

fmassa commented 5 years ago

@SupperSu I believe this is for the case where we have one of masks or boxes but not the other? Because we can infer a basic mask given a box, and a basic box given a mask. But what if we have neither boxes nor masks for a particular image?

SupperSu commented 5 years ago

In this case the dataset only has bounding boxes. Since it is possible to infer a bounding box from a mask, and a mask can handle a picture that has no bounding boxes, they predict the mask first and then convert the mask into bounding boxes.

fmassa commented 5 years ago

@SupperSu did you fix your issues then?

SupperSu commented 5 years ago

I am planning to use the mask to infer the bounding box. I checked the code and it seems there is no such function, so I think I could add it on my own. For now, I have followed your instruction and removed the pictures that have no bounding boxes.

fmassa commented 5 years ago

We don't currently have functionality to get the bounding box from a mask, but it should be fairly easy to obtain something like that. For a 2D binary mask containing a single object, the bounding box can be inferred with something like:

mask = ...  # 2D binary tensor of shape (H, W) containing a single object
# Indices of all nonzero pixels, split into row (y) and column (x) coordinates.
idx_y, idx_x = mask.nonzero().unbind(1)

# The box is the tightest [x1, y1, x2, y2] rectangle around those pixels.
x1 = idx_x.min()
x2 = idx_x.max()
y1 = idx_y.min()
y2 = idx_y.max()
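
If it helps, the same idea extended to a stack of masks could look like the sketch below; `masks_to_boxes` is a name I am making up here, not an existing helper in the repo.

import torch

def masks_to_boxes(masks):
    # masks: (N, H, W) binary tensor with one object per channel; every mask
    # is assumed to contain at least one nonzero pixel.
    boxes = []
    for mask in masks:
        idx_y, idx_x = mask.nonzero().unbind(1)
        boxes.append(torch.stack([idx_x.min(), idx_y.min(), idx_x.max(), idx_y.max()]))
    # Returns an (N, 4) float tensor of [x1, y1, x2, y2] boxes.
    return torch.stack(boxes).float()

The result could then be wrapped in a BoxList (mode "xyxy") if you want to feed it through the usual target pipeline.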