de-vri-es opened this issue 6 years ago
A few thoughts:
I think it doesn't make sense to treat an annotation that is clipped by 5% the same way as one that is clipped by 95% (dropping an annotation that is clipped by 5% seems very odd to me, and is probably a source of errors in training).
Clipping annotations introduces errors, since objects might become impossible to recognize from a small crop. For example, if a car annotation is clipped such that only a tire remains visible, there is no way to know whether it should be classified as car or truck.
Annotations are set as don't care when they are assigned to an anchor box with 0.4 < IoU < 0.5, so that they do not contribute to the loss. I think it makes a lot of sense to use a similar procedure here, so that objects that are cropped can be set to not contribute to the loss either way.
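For reference, that anchor-level don't-care band can be expressed as a small labeling rule. A minimal sketch, assuming an IoU matrix between anchors and ground-truth boxes; the function and argument names are illustrative, not library code:

```python
import numpy as np

def anchor_labels(overlaps, positive=0.5, ignore=0.4):
    """Label anchors from their best IoU with any ground-truth box:
    1 = positive, 0 = negative, -1 = don't care (excluded from the loss).

    overlaps: (num_anchors, num_gt) IoU matrix.
    """
    max_iou = overlaps.max(axis=1)
    labels = np.zeros(overlaps.shape[0], dtype=int)
    labels[max_iou >= positive] = 1
    # Anchors in the 0.4 < IoU < 0.5 band contribute nothing to the loss.
    labels[(max_iou > ignore) & (max_iou < positive)] = -1
    return labels
```

A cropped object could get the same treatment: flag it as -1 instead of dropping or blindly keeping it.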
What would you say about adjusting the canvas such that the entire image still fits? It would mean that all annotations remain valid, but also that rotations apply a sort of scaling since the transformed image is resized later on again.
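For rotations, the enlarged canvas is straightforward to compute from the absolute sine/cosine of the angle. A minimal sketch (the function name is mine):

```python
import numpy as np

def rotated_canvas_size(width, height, angle):
    """Smallest axis-aligned canvas that still contains a width x height
    image after rotating it by `angle` radians about its center."""
    cos, sin = abs(np.cos(angle)), abs(np.sin(angle))
    return int(np.ceil(width * cos + height * sin)), \
           int(np.ceil(width * sin + height * cos))
```

Because this canvas is larger than the original image, resizing it back to the network's input size shrinks the image content, which is the implicit scaling mentioned above.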
That could certainly work. However, I worry that this would make scale augmentation difficult, since zooms would result in very large images that might OOM.
I'll also note that we should care about zooms much more than rotations, since rotations are of limited use in detection problems: they degrade the quality of the bounding boxes, whereas zooms won't.
Right, but won't zoom be undone by the internal scaling anyway?
/edit: unless the X and Y scaling are not the same, in which case it does do something, I suppose.
The zoom should be applied such that it is not undone by internal scaling, right? Otherwise it has no effect. In my testing, I have found zooms more useful than either translations (no real effect) or rotations.
Oops, you're right. I didn't think that through. Since we're not modifying the canvas size right now, scaling does have an effect. If we did modify the canvas size, uniform scaling would become a NOP (the canvas would grow with the image, so the later resize to the network's input size would undo the zoom), and rotation/shearing would implicitly affect scaling. Hmm...
I really don't think this is right.
I'm looking into this issue now; here are my suggestions:
The transform applied to the image by the augmentation should exactly match what is done in Keras image preprocessing. This is the most intuitive behavior.
Users can specify what is done to boxes transformed out of the canvas. The options should be:
(a) discard the image if any box is transformed out of the canvas
(b) retain all boxes
(c) discard all boxes
(d) retain boxes with IoU > threshold, discard otherwise (see the sketch below)
(e) [optionally] add the ability to set boxes to don't care
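A minimal sketch of option (d), assuming boxes are an (N, 4) numpy array of (x1, y1, x2, y2) corners; all names are mine. Because the clipped box is contained in the original, the IoU simplifies to the ratio of clipped to original area:

```python
import numpy as np

def retain_by_iou(boxes, image_shape, threshold=0.5):
    """Option (d): clip transformed boxes to the canvas and keep only those
    whose clipped version has IoU > threshold with the original box."""
    height, width = image_shape[:2]
    clipped = boxes.copy()
    clipped[:, [0, 2]] = clipped[:, [0, 2]].clip(0, width)
    clipped[:, [1, 3]] = clipped[:, [1, 3]].clip(0, height)

    def area(b):
        return np.maximum(b[:, 2] - b[:, 0], 0) * np.maximum(b[:, 3] - b[:, 1], 0)

    # The clipped box is contained in the original, so their intersection is
    # the clipped box itself and IoU reduces to clipped area / original area.
    iou = area(clipped) / np.maximum(area(boxes), 1e-8)
    return clipped[iou > threshold]
```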
> Since we're not modifying the canvas size right now, scaling does have an effect. If we did modify the canvas size, uniform scaling would become a NOP, and rotation/shearing would implicitly affect scaling. Hmm...

> I really don't think this is right.
Yeah, I agree. Rotation/shearing shouldn't mess with scaling, and scaling shouldn't be a NOP.
I think the best solution is to let the user provide a callback which determines what to do with transformed annotations, and provide one or a few configurable callbacks in the library for the different options.
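A sketch of what that callback interface could look like; the signature is hypothetical, not an existing API:

```python
import numpy as np

# Hypothetical interface: the generator applies the transform, then calls
# a user-supplied callback (boxes, image_shape) -> boxes to keep.

def keep_all(boxes, image_shape):
    """Library-provided default: keep every transformed box unchanged."""
    return boxes

def drop_partially_outside(boxes, image_shape):
    """Stricter policy: drop any box that is not fully inside the canvas."""
    height, width = image_shape[:2]
    inside = (
        (boxes[:, 0] >= 0) & (boxes[:, 1] >= 0) &
        (boxes[:, 2] <= width) & (boxes[:, 3] <= height)
    )
    return boxes[inside]

boxes = np.array([[-5.0, 10.0, 50.0, 60.0], [10.0, 10.0, 40.0, 40.0]])
print(drop_partially_outside(boxes, image_shape=(100, 100)))  # keeps only the second box
```

The generator would then only ever call `boxes = callback(boxes, image.shape)` after applying the transform, so users could plug in the IoU filter above, a don't-care marker, or their own logic without touching the generator code.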
#190 added a new random transform generator as a replacement for the random transformations applied by Keras. One open question that remained was how to deal with annotations that are transformed such that they partially fall outside the image canvas. This can happen for translation, scaling, rotation and shearing (but not for flips).
In general I believe there are 3 strategies for dealing with this:
1) Throw away the annotation. This is what is currently happening automatically (I think).
2) Clip the annotation to the image canvas.
3) Keep retrying different transforms until no annotation is partially lost (completely lost should not be a problem).
A combination of these is also possible. For example: compare the area of the clipped, transformed bounding box to the original area, and drop the annotation if the ratio falls below some threshold; otherwise accept the clipped box.
The most flexible solution would be to let the user provide a callback which determines whether to clip or drop an annotation. We could combine this with a callback that applies the transformation to the annotations, if we want to support other forms of annotations besides bounding boxes.
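For axis-aligned bounding boxes, applying the transform could look like this: map all four corners through the affine matrix and take their axis-aligned hull. A minimal sketch, assuming a 3x3 homogeneous transform matrix; the name is mine:

```python
import numpy as np

def transform_aabb(transform, box):
    """Map an axis-aligned box (x1, y1, x2, y2) through a 3x3 affine
    transform and return the axis-aligned hull of the four corners."""
    x1, y1, x2, y2 = box
    corners = np.array([
        [x1, x1, x2, x2],       # x coordinates of the four corners
        [y1, y2, y1, y2],       # y coordinates
        [1.0, 1.0, 1.0, 1.0],   # homogeneous coordinates (affine: last row of transform is [0, 0, 1])
    ])
    mapped = transform @ corners
    return np.array([
        mapped[0].min(), mapped[1].min(),
        mapped[0].max(), mapped[1].max(),
    ])
```

Note that for rotations and shears the hull is larger than the object, which is exactly why those transforms degrade box quality while flips, translations and zooms do not.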