Closed Crazod closed 4 years ago
Yes, training is stable so far. In Eq. 12 we remove w^2+h^2, and alpha has no gradient flowing backward.
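For context, a sketch of where the w^2+h^2 factor comes from, obtained by differentiating the aspect-ratio term v as it is defined in the code posted in this thread. Here (w, h) is the predicted box, (w^{gt}, h^{gt}) the ground truth, and \Delta is shorthand introduced only for this note. Magnitudes are shown because the point is the common factor 1/(w^2+h^2):

```latex
v = \frac{4}{\pi^2}\,\Delta^2,
\qquad
\Delta = \arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}

\left|\frac{\partial v}{\partial w}\right| = \frac{8}{\pi^2}\,|\Delta|\,\frac{h}{w^2+h^2},
\qquad
\left|\frac{\partial v}{\partial h}\right| = \frac{8}{\pi^2}\,|\Delta|\,\frac{w}{w^2+h^2}
```

When w and h lie in [0, 1], 1/(w^2+h^2) is large, so dropping it shrinks the gradient; when w and h are in pixels, 1/(w^2+h^2) is tiny, so dropping it enlarges the gradient by roughly w^2+h^2.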
Hi, I have a question. The paper says the denominator w^2+h^2 is usually a small value when h and w range in [0, 1]. Is the variable "w" the box width normalized by the image size, or the real width in the image? I use the width of the real bbox as w. Is that wrong?
x, y, w, h are normalized to [0, 1] so as to balance with the cls loss. Did you print 'w' in the terminal?
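As a rough illustration of why the scale of w matters, here is a minimal sketch. The helper names `normalize_wh` and `v_grad_factor` are hypothetical and not taken from the repository; the sketch only evaluates the factor h / (w^2 + h^2) that appears in the exact width-gradient of v.

```python
def normalize_wh(w_px, h_px, img_w, img_h):
    """Scale a pixel-space box size to [0, 1] by the image size."""
    return w_px / img_w, h_px / img_h


def v_grad_factor(w, h):
    """Factor h / (w^2 + h^2) from the exact width-gradient of v."""
    return h / (w ** 2 + h ** 2)


# Same box, two unit systems (a 108 x 216 box in a 1080 x 1080 image).
w_px, h_px = 108.0, 216.0
w_n, h_n = normalize_wh(w_px, h_px, 1080.0, 1080.0)

factor_normalized = v_grad_factor(w_n, h_n)  # order of 1 or larger
factor_pixels = v_grad_factor(w_px, h_px)    # several orders of magnitude smaller

# If the 1/(w^2 + h^2) part is dropped, the remaining factor is just h,
# which is small for normalized boxes but large for pixel-space boxes.
simplified_normalized = h_n
simplified_pixels = h_px
```

With normalized sizes the simplified factor stays below 1, while in pixels it is as large as the box height, which is consistent with the instability reported for unnormalized boxes.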
Oh, thank you. I think I know why. My w is in real units, e.g. w ranges over 0 ~ 1080 (my image width), but in your code w is normalized: https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/master/utils/box/box_utils.py#L91. So when I implement your code in my project, w is not in [0, 1] but in [0, max width] of the original image. Is that right? So I changed the code:
    with torch.no_grad():
        arctan = torch.atan(w2 / h2) - torch.atan(w1 / h1)
        v = (4 / (math.pi ** 2)) * torch.pow((torch.atan(w2 / h2) - torch.atan(w1 / h1)), 2)
        S = 1 - iou
        alpha = v / (S + v)
        w_temp = 2 * w1
        # scale factor for pixel-space boxes (assumed constant, no gradient)
        distance = w1 ** 2 + h1 ** 2
    ar = (8 / (math.pi ** 2)) * arctan * (w1 - w_temp) * h1
    cious = iou - (u + alpha * ar / distance)
And now it converges on a small dataset. I need time to read the whole project, and I will test on COCO tonight. By the way, I'm working at SenseTime in Shanghai. If you would like an intern job in Beijing or Shanghai, contact me anytime: Wangliming@sensetime.com
Haha, thank you in advance. I have one and a half years until I graduate.
Thanks for sharing. Why use this distance: distance = w1 ** 2 + h1 ** 2? Is it to normalize w1*h1?
No, we did not. On the contrary, we removed w²+h².
In Zzh's original code, w^2 + h^2 was removed because the distances are normalized: w ranges over (0, 1). But I just want to use the DIoU loss in my own project, where all distances are real distances, e.g. w ranges over (0, width). So I normalize w and h by a distance. I think the exact distance is not very important; it is just a real number that avoids gradient explosion.
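One point that may help here: the CIoU value as defined, iou - (u + alpha * v), does not depend on the units of the boxes, since every term is a ratio. Only the gradient surrogate `ar` discussed above is unit-dependent. A minimal pure-Python sketch for a single pair of boxes follows; the function name `ciou` and the (x1, y1, x2, y2) layout are assumptions made for this illustration, and boxes are assumed to have positive width and height.

```python
import math


def ciou(box_a, box_b):
    """CIoU for two boxes given as (x1, y1, x2, y2) with positive size."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1

    # IoU term
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = wa * ha + wb * hb - inter
    iou = inter / union

    # Distance term: squared center distance over squared enclosing diagonal
    c2 = (max(ax2, bx2) - min(ax1, bx1)) ** 2 + (max(ay2, by2) - min(ay1, by1)) ** 2
    d2 = ((ax1 + ax2) / 2 - (bx1 + bx2) / 2) ** 2 + ((ay1 + ay2) / 2 - (by1 + by2) / 2) ** 2
    u = d2 / c2

    # Aspect-ratio term
    v = (4 / math.pi ** 2) * (math.atan(wb / hb) - math.atan(wa / ha)) ** 2
    denom = (1 - iou) + v
    alpha = v / denom if denom > 0 else 0.0  # identical boxes: v = 0, iou = 1

    return iou - (u + alpha * v)


# The same pair of boxes in normalized and in pixel coordinates.
a_norm, b_norm = (0.1, 0.2, 0.5, 0.6), (0.2, 0.3, 0.7, 0.5)
a_px = tuple(1080.0 * t for t in a_norm)
b_px = tuple(1080.0 * t for t in b_norm)

ciou_norm = ciou(a_norm, b_norm)
ciou_px = ciou(a_px, b_px)  # equal to ciou_norm up to floating-point error
```

So a normalizer such as w1 ** 2 + h1 ** 2 changes how strongly the aspect-ratio term pulls during training, not what the metric measures.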
@Crazod @Zzh-tju Thanks, I got it. But when I use the DIoU or CIoU loss in my code, it is not better than the GIoU loss on MS COCO. Maybe I missed something. So I want to run your source code, but I get this error:
      File "tools/train.py", line 287, in <module>
        train()
      File "tools/train.py", line 188, in train
        loss.backward()
      File "/opt/conda/lib/python3.6/site-packages/torch/tensor.py", line 102, in backward
        torch.autograd.backward(self, gradient, retain_graph, create_graph)
      File "/opt/conda/lib/python3.6/site-packages/torch/autograd/__init__.py", line 90, in backward
        allow_unreachable=True)  # allow_unreachable flag
    RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation
Maybe it is an environment problem; I'm not sure.
@ranjiewwen I got the same result: GIoU's result is better than CIoU's. Have you tried DIoU?
I've tried DIoU. It works very well! The result is much better than CIoU and GIoU!
@psyanglll I am not getting a better result. Have you run the source code and met this problem: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation?
@ranjiewwen I just used DIoU in my own SSD and got a better result than GIoU and CIoU. As for your problem, I guess it's the wrong version of PyTorch.
Thanks, I used PyTorch 0.4.1 and the problem is solved.
@psyanglll How much did performance improve with DIoU in your own SSD? In my SSD with DIoU, the performance dropped; my PyTorch is 0.3. Thank you!
I also work with real (pixel) values and changed the code as you posted. I then trained on the COCO dataset and found that the CIoU loss decreases extremely slowly.
@Zzh-tju I have tried the DIoU/CIoU loss in Keras, with a pre-trained VGG16 and the Pascal VOC dataset. Smooth L1 looks fine, but when I switch to DIoU/CIoU, the loss grows rapidly and training soon terminates.
Here is my code:
    import math
    import tensorflow as tf

    def ciou_loss(self, y_true, y_pred):
        """
        Implements the CIoU loss function.
        CIoU is an enhancement for models which use IoU in object detection.
        Args:
            y_true: true targets tensor. The coordinates of each bounding box
                are encoded as [y_min, x_min, y_max, x_max].
            y_pred: predictions tensor. The coordinates of each bounding box
                are encoded as [y_min, x_min, y_max, x_max].
        Returns:
            CIoU loss float `Tensor`.
        """
        y_pred = tf.convert_to_tensor(y_pred)
        if not y_pred.dtype.is_floating:
            y_pred = tf.cast(y_pred, tf.float32)
        y_true = tf.cast(y_true, y_pred.dtype)
        ciou = tf.squeeze(self.calculate_ciou(y_pred, y_true))
        return 1 - ciou

    def calculate_ciou(self, b1, b2):
        """
        Arguments:
            target (nD tensor): A TensorFlow tensor of any shape containing the ground truth data.
                In this context, the expected tensor has shape `(batch_size, #boxes, 4)` and
                contains the ground truth bounding box coordinates, where the last dimension
                contains `(xmin, xmax, ymin, ymax)`.
            output (nD tensor): A TensorFlow tensor of identical structure to `y_true` containing
                the predicted data, in this context the predicted bounding box coordinates.
        Takes in a list of bounding boxes but can work for a single bounding box too.
        All the boundary cases such as bounding boxes of size 0 are handled.
        """
        zero = tf.convert_to_tensor(0.0, b1.dtype)
        x1g, y1g, x2g, y2g = tf.unstack(value=b1, num=4, axis=-1)
        x1, y1, x2, y2 = tf.unstack(value=b2, num=4, axis=-1)
        true_width = tf.maximum(zero, x2g - x1g)
        true_height = tf.maximum(zero, y2g - y1g)
        pred_width = tf.maximum(zero, x2 - x1)
        pred_height = tf.maximum(zero, y2 - y1)
        true_area = true_width * true_height
        pred_area = pred_width * pred_height

        # IoU term
        intersect_ymin = tf.maximum(y1g, y1)
        intersect_xmin = tf.maximum(x1g, x1)
        intersect_ymax = tf.minimum(y2g, y2)
        intersect_xmax = tf.minimum(x2g, x2)
        intersect_width = tf.maximum(zero, intersect_xmax - intersect_xmin)
        intersect_height = tf.maximum(zero, intersect_ymax - intersect_ymin)
        intersect_area = intersect_width * intersect_height
        union_area = true_area + pred_area - intersect_area
        iou = tf.math.divide_no_nan(intersect_area, union_area)

        # Distance term
        x_center = (x2 + x1) / 2
        y_center = (y2 + y1) / 2
        x_center_g = (x1g + x2g) / 2
        y_center_g = (y1g + y2g) / 2
        xc1 = tf.minimum(x1, x1g)
        yc1 = tf.minimum(y1, y1g)
        xc2 = tf.maximum(x2, x2g)
        yc2 = tf.maximum(y2, y2g)
        c = tf.pow((xc2 - xc1), 2) + tf.pow((yc2 - yc1), 2)
        d = tf.pow((x_center - x_center_g), 2) + tf.pow((y_center - y_center_g), 2)
        u = tf.math.divide_no_nan(d, c)

        # Aspect-ratio term
        arctan = tf.atan(tf.math.divide_no_nan(true_width, true_height)) - tf.atan(tf.math.divide_no_nan(pred_width, pred_height))
        v = (4 / tf.pow(math.pi, 2)) * tf.pow((tf.atan(tf.math.divide_no_nan(true_width, true_height)) -
                                                tf.atan(tf.math.divide_no_nan(pred_width, pred_height))), 2)
        s = 1 - iou
        alpha = tf.math.divide_no_nan(v, s + v)
        w_temp = 2 * pred_width
        # ar = (4 / (tf.pow(math.pi, 2))) * tf.pow(arctan, 2)
        # ar = tf.clip_by_value(alpha * v, -1.0, 1.0)

        # Calculate CIoU
        ciou = iou - (u + alpha * v)
        # ciou = tf.clip_by_value(ciou, -1.0, 1.0)  # If I add this line, the loss is stable, but the result is not good
        return ciou
I looked at your code; it seems you add a clamp that restricts the value of CIoU to between -1 and 1. Have you encountered similar things?
Hi, I'm confused about the statement in your paper that you "removed w²+h²". Where can I find it in the code? It seems that the implementation just makes "alpha" constant, but won't the variable v still be differentiated with w²+h²?
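One possible reading of the PyTorch snippet posted earlier in this thread, offered as a sketch rather than the authors' answer: if `arctan` and `w_temp` are computed under `torch.no_grad()` (an assumption about the original indentation), then `ar` is linear in w1 and h1, so its gradient carries no 1/(w²+h²) factor. The pure-Python check below uses finite differences and compares magnitudes only; `ar_surrogate`, `v_exact` and `finite_diff` are hypothetical names for this illustration.

```python
import math

K = 8 / math.pi ** 2


def ar_surrogate(w1, h1, arctan_const, w_temp_const):
    """The `ar` expression with arctan and w_temp held constant."""
    return K * arctan_const * (w1 - w_temp_const) * h1


def v_exact(w1, h1, w2, h2):
    """Aspect-ratio term v, differentiated through the arctan."""
    return (4 / math.pi ** 2) * (math.atan(w2 / h2) - math.atan(w1 / h1)) ** 2


def finite_diff(f, x, eps=1e-6):
    """Central finite-difference approximation of df/dx."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


w1, h1 = 0.3, 0.5   # predicted width and height (normalized)
w2, h2 = 0.4, 0.4   # ground-truth width and height (normalized)

# Values that would be frozen inside the no_grad block.
arctan_const = math.atan(w2 / h2) - math.atan(w1 / h1)
w_temp_const = 2 * w1

# Gradient of the surrogate with respect to w1: K * arctan * h1.
grad_ar_w = finite_diff(lambda w: ar_surrogate(w, h1, arctan_const, w_temp_const), w1)

# Gradient of the exact v with respect to w1 keeps the 1/(w^2 + h^2) factor.
grad_v_w = finite_diff(lambda w: v_exact(w, h1, w2, h2), w1)

# In magnitude, the surrogate gradient is the exact one times (w1^2 + h1^2).
ratio = abs(grad_ar_w) / abs(grad_v_w)
```

Under that reading, v itself is only used as a value (for alpha), and the gradient of the aspect-ratio term comes entirely from `ar`.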
Hi, Zhao Hui: I tried the DIoU loss in RetinaNet and it converges well. But when I try the CIoU loss in RetinaNet, it does not converge stably. I use your code and have a question about Equation (12) in your paper: you have replaced many variables. Does it always work stably?