Open lufanma opened 3 months ago
Also, the above setting supports at most batch size 1 when training on a V100.
@lufanma Thank you for your attention! This metric looks strange. After reducing the image input, you can either modify final_dim or remove one downsampling layer from self.dtransform in cam_stream_lss.py to keep the feature sizes aligned with the depth map sizes.
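For readers hitting the same mismatch, here is a minimal sketch of the second option. The layer shapes below are illustrative assumptions, not the actual EA-LSS definition of self.dtransform: the point is only that dropping one stride-2 stage halves the total downsampling factor, so a half-resolution input yields the same feature size as the full-resolution original.

```python
import torch
import torch.nn as nn

def make_dtransform(downsample: int) -> nn.Sequential:
    """Illustrative depth-transform stack: each stride-2 conv halves H and W."""
    layers, ch = [nn.Conv2d(1, 8, 1)], 8
    while downsample > 1:
        layers += [nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.ReLU(True)]
        ch *= 2
        downsample //= 2
    return nn.Sequential(*layers)

# original: 8x downsampling on a full-resolution (896 x 1600) depth map
d8 = make_dtransform(8)
# reduced: 4x downsampling for a half-resolution (448 x 800) input
d4 = make_dtransform(4)

full = torch.zeros(1, 1, 896, 1600)
half = torch.zeros(1, 1, 448, 800)
print(tuple(d8(full).shape[-2:]), tuple(d4(half).shape[-2:]))  # both (112, 200)
```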
Wow, thanks for the quick reply! Since the camera intrinsics used in the img2lidar transform correspond exactly to the original image coordinates, I think the depth map size should match the original RGB size; otherwise the intrinsics would no longer correspond to the image coordinates.
So I keep final_dim (1600, 896) and self.dtransform unchanged, since fpnc fuses the different feature scales at 1/8 of final_dim. This way, the feature sizes after self.dtransform (8x downsampling) are aligned with the depth map sizes.
@hht1996ok Is my understanding above correct?
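The alternative to keeping the full-size depth map would be rescaling the intrinsics along with the image. A hedged numpy sketch of that bookkeeping (the 3x3 K below uses made-up example values, not ones from the repo): fx and cx scale with image width, fy and cy with image height.

```python
import numpy as np

def scale_intrinsics(K: np.ndarray, sx: float, sy: float) -> np.ndarray:
    """Rescale a 3x3 pinhole intrinsic matrix for a resized image."""
    S = np.diag([sx, sy, 1.0])
    return S @ K

# example intrinsics for a 1600x896 image (illustrative values)
K = np.array([[1266.0, 0.0, 800.0],
              [0.0, 1266.0, 448.0],
              [0.0, 0.0, 1.0]])

# halve the resolution: 1600x896 -> 800x448
K_half = scale_intrinsics(K, 0.5, 0.5)
print(K_half[0, 0], K_half[0, 2], K_half[1, 2])  # 633.0 400.0 224.0
```

If the depth map stays at the original size, as done in this thread, no rescaling is needed and the original K can be used unchanged.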
I changed the related code in ealss.py and ealss_cam.py like this:

```python
# original
# generate a depth map matching the original image size
ogfH, ogfW = self.final_dim
depth_H, depth_W = ogfH - 4, ogfW  # H=896, W=1600
# allocate the depth map: (B, 6, 1, 896, 1600)
depth = torch.zeros(batch_size, img_size[1], 1, depth_H, depth_W).cuda()
```
and the on-image clipping check:

```python
# original
# on_img = (
#     (cur_coords[..., 0] < img_size[3])
#     & (cur_coords[..., 0] >= 0)
#     & (cur_coords[..., 1] < img_size[4])
#     & (cur_coords[..., 1] >= 0)
# )  # (6, N) 0/1 matrix

# cur_coords shape is (6, N, 2), (y, x) order
on_img = (
    (cur_coords[..., 0] < depth_H)
    & (cur_coords[..., 0] >= 0)
    & (cur_coords[..., 1] < depth_W)
    & (cur_coords[..., 1] >= 0)
)  # (6, N) 0/1 matrix
```
@hht1996ok Hi Haotian, could you please answer my question? Thanks.
@lufanma This looks right; you should keep the depth map dimensions aligned with the feature dimensions.
@hht1996ok I have a problem.
With load_img_from = the weights I trained from bevfusion (cam), the results are bad: the loss stays around 9. With the mask~.pth weights, the loss was about 4 and does not seem to decrease. I have tried final_dim = 800x448 (img_scale = 448x800), keeping the downsampling and self.dtransform unchanged, but the loss is still very high.
Even without having run his experiments, I can see your problem: the heatmap loss in the third epoch is far too large; even a pure-image model doesn't get a loss that high.
Are you using bev pool v1 or v2? If v2, is it very slow?
The bevpool this author uses should be v1. Following the bevfusion config (img_scale 448x800), I changed the depth size as well as the upsample and dtransform; that is the current situation.
@lufanma What mAP did you get? I changed img_scale to 800x448 and the corresponding depth to 800x448 as well, but the resulting mAP is only 0.3199.
Brother, do you have GPUs? If you do, I'll take you to CVPR25 with a temporal follow-up. My single-frame image + point cloud fusion currently reaches 73.9% mAP on the test set, and I'm preparing to submit it to AAAI24; we would build a temporal version on top of that paper. Provide the GPUs and you get co-authorship. How about it? This score is only 0.5% mAP behind SparseLIF, whose image input is 21 frames while mine is a single frame, so there is a good chance of beating it.
Ah, a temporal version probably needs large GPU memory to work with. I have the ideas worked out, but no GPUs to run the experiments.
Yeah, I can't even reproduce this EALSS; 48 GB isn't enough.
48 GB is enough; I ran this AAAI24 submission on a 4090. For the next paper I plan to use the SlowFast recipe.
If it gets accepted I'll release the code; you can see my results then.
@hht1996ok Thanks for the very SOTA work and for releasing the code!
In the original camera-stream setting, the input image size (i.e., img_scale) is (1600, 896).
I ran experiments on 4x V100. The original input setting causes CUDA out of memory, so I reduced the input size to half (800x448) and updated the related code in ealss.py and ealss_cam.py (the extract_feat function) that generates the multi-view depth map from the lidar point-cloud projection. I still generate the depth map at the original RGB image size of (1600, 896).
After 20 epochs of full training, the evaluation metrics are very low (only mAP 11.55, NDS 17.43), which seems abnormal. I guess this is because a low-resolution feature map (effectively the C4-level feature relative to the original size) is used instead of the original setting (the C2-level feature relative to the original size).
@hht1996ok Could you please give me some guidance on the question above? Thanks very much! Looking forward to your reply.