wangjp0408 opened 5 years ago
Hi,
This is a great question!
The way I currently do it is to remove all empty images in the initialization of the dataset. If this is doable to you, then it would be great!
If this isn't possible, you could try hacking around in here to continue the training if any of the objects in the BoxList is empty.
Something like

```python
# Skip this iteration if any image in the batch has an empty BoxList.
if any(len(t) == 0 for t in targets):
    continue
```
but this will bump the iteration counter, which is not ideal but is a simple enough workaround. Does this work for you?
I am currently having the same problem. My dataset has two labels, diseased or not; if an image is not diseased, it has no bounding boxes. Here I simply set the box coordinates to all zeros (x = 0, y = 0, x = 0, y = 0) with label 0, and training runs normally, but I am not sure how this influences the performance.
You might be degrading the performance of your model because you are assigning a positive box to something completely random.
We should define what we want to do in those cases: do we want to only sample the negative examples (and thus not have the positives in the batch)? This might bias the learning for those images, which might not be good.
I'd say the better thing really is to remove those images in the constructor of your dataset, for example as is done here in the COCODataset.
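The filtering suggested above can be sketched as follows. This is a minimal illustration, not the actual COCODataset code; `filter_empty_images` and the `annotations` mapping (image id to list of box dicts) are hypothetical names used for the example.

```python
# Sketch: drop images with no annotations at dataset-construction time,
# similar in spirit to what COCODataset does. Hypothetical data layout.

def filter_empty_images(image_ids, annotations):
    """Keep only image ids that have at least one annotated box."""
    return [img_id for img_id in image_ids if len(annotations.get(img_id, [])) > 0]

image_ids = [1, 2, 3]
annotations = {
    1: [{"bbox": [0, 0, 10, 10]}],
    2: [],  # empty image: filtered out
    3: [{"bbox": [5, 5, 20, 20]}],
}
print(filter_empty_images(image_ids, annotations))  # -> [1, 3]
```

Because the filtering happens once in `__init__`, the training loop never sees an empty target and the iteration counter is unaffected, unlike the `continue` workaround above.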
Thanks for your reply! The first way you mentioned above works well for me, and I think it's easy to do. Thanks a lot :)
Thank you for your response. I found that some people use the mask to predict the bounding box: they generate a mask filled with zeros and then mark every pixel inside a bounding box as 1.
@SupperSu I believe this is the case where we have one of masks or boxes but not the other? We can infer a basic mask given a box, and a basic box given a mask. But what if we have neither boxes nor masks for a particular image?
In this case, since the dataset only has bounding boxes, and since a bounding box can be inferred from a mask while a mask can also represent an image with no bounding box, they predict the mask first and then convert the mask into bounding boxes.
@SupperSu did you fix your issues then?
I am planning to use the mask to infer the bounding box. I checked the code, and it seems there is no such function, so I think maybe I could add it on my own. For now, I followed your instruction and removed the images that have no bounding boxes.
We don't currently have functionality to get the bounding box for a mask, but it should be fairly easy to obtain. For a 2D binary mask containing a single object, the bounding box can be inferred with something like:

```python
mask = ...  # 2D binary torch tensor
# Coordinates of all nonzero pixels, split into y and x index tensors.
idx_y, idx_x = mask.nonzero().unbind(1)
x1 = idx_x.min()
x2 = idx_x.max()
y1 = idx_y.min()
y2 = idx_y.max()
```
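The same idea works outside PyTorch as well. Here is a small self-contained NumPy version with a toy mask, just to illustrate the min/max-over-nonzero-indices trick (the `mask_to_box` helper is made up for this example):

```python
import numpy as np

def mask_to_box(mask):
    """Infer an (x1, y1, x2, y2) box from a 2D binary mask with one object."""
    idx_y, idx_x = np.nonzero(mask)  # row and column indices of all 1-pixels
    return int(idx_x.min()), int(idx_y.min()), int(idx_x.max()), int(idx_y.max())

mask = np.zeros((6, 8), dtype=np.uint8)
mask[2:5, 3:7] = 1  # object occupies rows 2..4, columns 3..6
print(mask_to_box(mask))  # -> (3, 2, 6, 4)
```

Note this only gives a single box; if the mask may contain several disconnected objects, you would need a connected-components pass first.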
❓ Questions and Help
I train on my own dataset using a "sliding window" approach, because the images are very large and contain lots of objects (about 1000 per image). This raises a problem: when the window slides to a position where there is nothing (not a single box) in it, I still feed the sub-region image in the current window into the network, but how should I deal with the boxes, labels, and masks? Would it work if I just set a box as [0, 0, 0, 0], add the field label as 0, and use a mask with ...[0, 0]? ... Thanks for your help!
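One way to prepare targets per window, consistent with the advice above (skip empty windows rather than inventing [0, 0, 0, 0] boxes), is to clip each box to the window and keep only boxes with positive remaining area. This is a hypothetical helper sketched for illustration, not code from maskrcnn-benchmark:

```python
# Sketch: clip (x1, y1, x2, y2) boxes to a sliding window, shift survivors
# into window coordinates, and drop boxes that fall entirely outside.
# A window whose returned list is empty should simply be skipped.

def boxes_in_window(boxes, win):
    wx1, wy1, wx2, wy2 = win
    kept = []
    for x1, y1, x2, y2 in boxes:
        cx1, cy1 = max(x1, wx1), max(y1, wy1)
        cx2, cy2 = min(x2, wx2), min(y2, wy2)
        if cx2 > cx1 and cy2 > cy1:  # box still has positive area
            kept.append((cx1 - wx1, cy1 - wy1, cx2 - wx1, cy2 - wy1))
    return kept

boxes = [(10, 10, 30, 30), (200, 200, 220, 220)]
print(boxes_in_window(boxes, (0, 0, 100, 100)))      # -> [(10, 10, 30, 30)]
print(boxes_in_window(boxes, (150, 150, 250, 250)))  # -> [(50, 50, 70, 70)]
print(boxes_in_window(boxes, (50, 50, 150, 150)))    # -> [] -> skip this window
```

Masks can be handled the same way by cropping the mask array to the window and checking whether anything nonzero remains.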