Closed Castiel-Lee closed 1 year ago
Thanks for reporting this error. I think the position is correct. I generate the index in the 2D matrix in this way: https://github.com/TuSimple/centerformer/blob/5a949b88ed7bb15aafb39bf78c95f1452063ebea/det3d/models/necks/rpn_transformer.py#L247-L250
Here `neighbor_coords[:, :, :, 1]` comes from the `y_coord` and `neighbor_coords[:, :, :, 0]` comes from the `x_coord`.
However, the bug is probably this: https://github.com/TuSimple/centerformer/blob/5a949b88ed7bb15aafb39bf78c95f1452063ebea/det3d/models/necks/rpn_transformer.py#L244-L246
The coordinates should be clamped separately using their own size, rather than just assuming H = W.
I don't have a machine at hand to test. Can you check if changing this can fix your error?
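If it helps, here is a minimal sketch of the separate clamping (the sizes and variable names are illustrative, not the exact ones in `rpn_transformer.py`):

```python
import torch

# Example feature-map size with H != W, so the two bounds differ.
H, W = 4, 8
neighbor_coords = torch.tensor([[3, 9],
                                [5, 2]])  # (x, y) pairs, some out of range

# Clamp each axis with its own bound instead of a single shared size.
x_coord = neighbor_coords[..., 0].clamp(min=0, max=W - 1)  # width index in [0, W-1]
y_coord = neighbor_coords[..., 1].clamp(min=0, max=H - 1)  # height index in [0, H-1]
```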
Hello,
I noticed that the reported error does disappear when `_xcoor` and `_ycoor` are clamped within [0, H] and [0, W] respectively. But that probably just hides the underlying fact that the code computes `x_coor = order // W` and `y_coor = order % W`. I did a small test:
According to the picture, the revised version seems to correctly produce x_coor = 7 and y_coor = 4.
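For reference, here is a small reproduction of that test (H, W, and the flat index are my own example values, since the picture is not shown here):

```python
# In a row-major (H, W) matrix, a flat index decomposes as
#   row (height / y) = order // W,  col (width / x) = order % W.
H, W = 6, 10               # example sizes, H != W
order = 4 * W + 7          # flat index of the pixel at row 4, column 7

# Decomposition as in the original code: x_coor = order // W, y_coor = order % W
x_old, y_old = order // W, order % W   # yields x_old = 4, y_old = 7 (swapped)

# Revised decomposition: the quotient is the height (y) coordinate
y_new, x_new = order // W, order % W   # yields y_new = 4, x_new = 7
```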
Could you think this over and recheck the implementation when you are available? Of course, if there is a transpose or some other purpose I am missing, I apologize for my misunderstanding and for bothering you.
Oh, I see. So the `x_coor` and `y_coor` here do not mean the row and column indexes of this value in the 2D matrix (like [7, 4]). Instead, they are the indexes on the height dimension (`y_coor`) and the width dimension (`x_coor`), which is the common way to describe the coordinates of a pixel in an image.
I will recheck the code just in case there are some other bugs.
Since `y_coor` is the height dimension and `x_coor` is the width dimension, in Line 469 shouldn't it be `torch.stack([y_coor, x_coor], dim=2)`?
The order of the `x_coor` and `y_coor` variables does not matter in this case, as long as you remember the correct dimension that each variable represents. The deformable attention layer also requires storing the reference point position in the same way.
Oh, I see. I will go through it again and recheck it. Thank you so much for the clarification.
Great! If you have any further questions or concerns, feel free to reopen the issue.
Those two lines seem flawless. But coincidentally, I changed the range of the PC, which made H unequal to W, and I encountered an error. After debugging, I found that in `get_multi_scale_feature`, `_centerpos` sometimes fell outside the range of `feat`. Checking backwards, I found that some elements of `_ycoor` > W existed at Line 468. After some experiments, this problem could be fixed by swapping `_xcoord` and `_ycoord`.