Closed (shawrby closed this 2 years ago)
Great question.
A/B (in function names) are arbitrary coordinate systems -- they are just placeholders.
ref = reference camera. At test time, this is facing forward.
mem = "memory" coordinates == model coordinates == voxel coordinates
cam = camera = 3d coordinate system
pix = pixels = projected camera coordinates (2d)
Feel free to ask more, to help me answer your precise question.
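To make the cam/pix distinction concrete, here is a minimal NumPy sketch of projecting 3D camera-frame points to 2D pixels. It assumes a plain pinhole model with z pointing forward; the function name `cam_to_pix` and the intrinsics values (`fx`, `fy`, `cx`, `cy`) are made up for illustration and are not taken from the repo.

```python
import numpy as np

def cam_to_pix(xyz_cam, fx, fy, cx, cy):
    """Project N x 3 camera-frame points to N x 2 pixel coordinates.

    Assumes a pinhole camera with z pointing forward and z > 0 for every point.
    """
    x, y, z = xyz_cam[:, 0], xyz_cam[:, 1], xyz_cam[:, 2]
    u = fx * x / z + cx  # horizontal pixel coordinate
    v = fy * y / z + cy  # vertical pixel coordinate
    return np.stack([u, v], axis=1)

# Two hypothetical points in cam coordinates (meters).
xyz_cam = np.array([[0.0, 0.0, 10.0],
                    [1.0, -0.5, 5.0]])

# Hypothetical intrinsics for a 640 x 480 image.
xy_pix = cam_to_pix(xyz_cam, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(xy_pix)
```

A point on the optical axis lands on the principal point (cx, cy), which is a quick sanity check for any projection code.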
@aharley
I have an additional question.
The main parameters of the function in the picture below are,
Shouldn't the utils.geom.apply_4x4 function take mem_T_ref as its argument instead?
No, the notation goes the other way. The way to read it is: ref_T_mem transports mem points into ref coordinates.
The visual shortcut is: ref_T_mem * xyz_mem is a valid matmul, because the mem coords are adjacent.
(This convention lets us easily keep track of valid transformations, such as point_a = a_T_b * b_T_c * c_T_d * point_d.)
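The a_T_b convention can be checked with a small NumPy sketch. This is not the repo's code: `make_translation` and `apply_4x4_np` are hypothetical helpers written for this example, and the transforms are pure translations so the result is easy to verify by hand.

```python
import numpy as np

def make_translation(tx, ty, tz):
    """Build a 4x4 transform that only translates (hypothetical helper)."""
    T = np.eye(4)
    T[:3, 3] = [tx, ty, tz]
    return T

def apply_4x4_np(a_T_b, xyz_b):
    """Map N x 3 points from b coordinates into a coordinates."""
    ones = np.ones((xyz_b.shape[0], 1))
    xyz1_b = np.concatenate([xyz_b, ones], axis=1)  # N x 4 homogeneous points
    xyz1_a = xyz1_b @ a_T_b.T                       # equals (a_T_b @ xyz1_b.T).T
    return xyz1_a[:, :3]

a_T_b = make_translation(1.0, 0.0, 0.0)
b_T_c = make_translation(0.0, 2.0, 0.0)

# The adjacent "b" labels cancel, leaving a transform from c into a.
a_T_c = a_T_b @ b_T_c

xyz_c = np.array([[0.0, 0.0, 0.0],
                  [1.0, 1.0, 1.0]])

xyz_a = apply_4x4_np(a_T_c, xyz_c)
# Applying the two transforms one at a time gives the same points.
xyz_a_stepwise = apply_4x4_np(a_T_b, apply_4x4_np(b_T_c, xyz_c))

# Inverting swaps the labels: inv(a_T_c) is c_T_a.
c_T_a = np.linalg.inv(a_T_c)
xyz_c_roundtrip = apply_4x4_np(c_T_a, xyz_a)
```

Reading the names right to left tells you which frame the input points must be in (here, c) and which frame the output lands in (here, a).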
Thanks for the clear explanation!
@aharley
What exactly do cam0 and camXs mean?
cam0 is the camera currently being used as the "reference" -- this dictates the orientation of the 3D/BEV tensors, which accordingly live in mem0 (i.e., the result of applying ref2mem to cam0 data). At test time, cam0 is the forward-facing camera; at training time, it is a random camera. camXs are all the other cameras.
@aharley
Thanks for the kind explanation!
@aharley
Sorry for all the questions.
I don't understand how the get_occupancy function builds the occupancy map from radar data. Could you recommend a paper or other reference material that would help me understand it?
The process there is just to find out which voxels have a point inside, and then set the value to 1 for those voxels. This type of function is pretty common so you can maybe google "convert point cloud to occupancy grid" to see lots of answers.
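As a rough illustration of that idea, here is a minimal NumPy sketch of turning a point cloud into a binary occupancy grid. It is not the repo's get_occupancy; the function name `points_to_occupancy`, the bounds, and the grid size are all assumptions chosen for the example.

```python
import numpy as np

def points_to_occupancy(xyz, bounds, grid_shape):
    """Mark every voxel that contains at least one point.

    xyz:        N x 3 array of points in metric coordinates.
    bounds:     ((xmin, xmax), (ymin, ymax), (zmin, zmax)) of the grid.
    grid_shape: (X, Y, Z) number of voxels along each axis.
    """
    bounds = np.asarray(bounds, dtype=np.float64)
    shape = np.asarray(grid_shape)
    mins, maxs = bounds[:, 0], bounds[:, 1]
    voxel_size = (maxs - mins) / shape

    # Convert metric coordinates to integer voxel indices.
    idx = np.floor((xyz - mins) / voxel_size).astype(np.int64)

    # Drop points that fall outside the grid.
    inside = np.all((idx >= 0) & (idx < shape), axis=1)
    idx = idx[inside]

    occ = np.zeros(grid_shape, dtype=np.float32)
    occ[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0
    return occ

# Hypothetical points: two share a voxel, one is outside the bounds.
xyz = np.array([[0.5, 0.5, 0.5],
                [0.6, 0.4, 0.5],
                [3.5, 1.5, 0.5],
                [10.0, 0.0, 0.0]])
occ = points_to_occupancy(xyz, bounds=((0, 4), (0, 4), (0, 1)), grid_shape=(4, 4, 1))
```

Several points falling into the same voxel still produce a single 1, so the grid records presence rather than point counts.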
Hi there,
Thank you for your research.
I don't fully understand the Ref, Center, Mem, camA, camB, and pixB coordinates in vox_util. Could you explain them in simple terms?