You can access the output of the radar backbone from the variable named x_other. See https://github.com/longyunf/radiant/blob/cf5355396d42ef17940e29ef8f9e3cabfd8035c3/lib/my_model/radiant_fcos3d_network.py#L267
Yes, the functions freeze_subnet and freeze_cam_heads prevent changes to the monocular weights. See https://github.com/longyunf/radiant/blob/cf5355396d42ef17940e29ef8f9e3cabfd8035c3/scripts/train_radiant_fcos3d.py#L137C1-L138C28
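For reference, this is a minimal sketch of how such freezing typically works in PyTorch; the module names in the comments are hypothetical, not the exact attributes used in the repo:

```python
import torch.nn as nn

def freeze_module(module: nn.Module):
    """Freeze a sub-network: stop gradient updates and fix BatchNorm statistics."""
    for p in module.parameters():
        p.requires_grad = False
    module.eval()  # keeps BatchNorm running stats / Dropout deterministic

# Hypothetical usage: freeze the monocular (camera) backbone and heads before
# training the radar branch, analogous to freeze_subnet / freeze_cam_heads.
# freeze_module(model.backbone)
# freeze_module(model.bbox_head)
```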
May I ask where you call the extract_feat function and concatenate the image and radar features, like in the image below (the concatenation part)? Thank you.
OK, thanks! In addition, I would like to ask about training the radar branch: how do you obtain the ground truth for calculating the offset to the object center? I don't quite understand how it is done in the paper. Thanks!
Also, if I comment out the functions freeze_subnet and freeze_cam_heads, the model in the camera branch will change, right? Will this change affect the radar branch training or the detection results? Thank you.
We associate radar points with GT boxes and compute 2D offsets from the radar points to the corresponding GT centers on the image, as well as depth offsets. (see https://github.com/longyunf/radiant/blob/cf5355396d42ef17940e29ef8f9e3cabfd8035c3/lib/my_model/radiant_fcos3d_network.py#L2414)
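As a rough illustration of that association step (a minimal sketch with hypothetical shapes and a simple nearest-center matching, not the exact logic at the link above):

```python
import numpy as np

def radar_gt_offsets(radar_uv, radar_depth, gt_centers_uv, gt_depths, max_pixel_dist=50.0):
    """For each radar point (pixel coords + measured depth), find the nearest GT box
    center on the image and return the 2D pixel offset and the depth offset to it.

    radar_uv:      (N, 2) radar points projected to image pixels
    radar_depth:   (N,)   radar depth along the camera axis
    gt_centers_uv: (M, 2) projected GT 3D box centers
    gt_depths:     (M,)   GT center depths
    """
    # Pairwise pixel distances between radar points and GT centers
    d = np.linalg.norm(radar_uv[:, None, :] - gt_centers_uv[None, :, :], axis=-1)  # (N, M)
    nearest = d.argmin(axis=1)
    valid = d[np.arange(len(radar_uv)), nearest] < max_pixel_dist  # drop unmatched points

    offset_uv = gt_centers_uv[nearest] - radar_uv        # 2D offset targets
    offset_depth = gt_depths[nearest] - radar_depth      # depth offset targets
    return offset_uv, offset_depth, valid
```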
Yes, the camera weights will change and may not preserve optimal monocular detection performance if you do not freeze them.
Hello, thank you very much for your response. I also want to ask:
- I do not make that assumption. The radar points are typically not at object centers.
- For radar inputs, see https://github.com/longyunf/radiant/blob/cf5355396d42ef17940e29ef8f9e3cabfd8035c3/lib/my_pipelines.py#L197 (a rough sketch of the idea follows this list)
- No.
- Use the GT depths of the objects.
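To make the radar-input idea concrete, here is a loose sketch of projecting radar points into the image and scattering their features into a dense map; the function name and the channel contents are assumptions for illustration, not the repo's exact format:

```python
import numpy as np

def build_radar_map(points_cam, features, img_h, img_w, K):
    """Project radar points (in camera coordinates) into the image and scatter their
    features into a dense (C, H, W) map aligned with the camera image; empty pixels stay zero.

    points_cam: (N, 3) radar points in the camera frame
    features:   (N, C) per-point features, e.g. depth, radial velocity, RCS
    K:          (3, 3) camera intrinsics
    """
    radar_map = np.zeros((features.shape[1], img_h, img_w), dtype=np.float32)
    z = points_cam[:, 2]
    keep = z > 0.1                               # keep points in front of the camera
    uvw = (K @ points_cam[keep].T).T
    u = (uvw[:, 0] / uvw[:, 2]).astype(int)
    v = (uvw[:, 1] / uvw[:, 2]).astype(int)
    inside = (u >= 0) & (u < img_w) & (v >= 0) & (v < img_h)
    radar_map[:, v[inside], u[inside]] = features[keep][inside].T
    return radar_map
```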
Outputs can be visualized from bounding boxes (camera branch) or from predicted offsets to object centers from radar points (radar branch):
- camera branch outputs: https://github.com/longyunf/radiant/blob/cf5355396d42ef17940e29ef8f9e3cabfd8035c3/lib/my_model/radiant_fcos3d_network.py#L959
- radar branch outputs: https://github.com/longyunf/radiant/blob/cf5355396d42ef17940e29ef8f9e3cabfd8035c3/lib/my_model/radiant_fcos3d_network.py#L1002
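A minimal way to eyeball the radar-branch output could look like the following; the variable names are placeholders, and the actual tensors would come from the lines linked above:

```python
import matplotlib.pyplot as plt

def show_radar_offsets(img, radar_uv, pred_offset_uv):
    """Draw each radar point and an arrow to its predicted object center on the image."""
    plt.imshow(img)
    plt.scatter(radar_uv[:, 0], radar_uv[:, 1], c='cyan', s=8, label='radar points')
    plt.quiver(radar_uv[:, 0], radar_uv[:, 1],
               pred_offset_uv[:, 0], pred_offset_uv[:, 1],
               angles='xy', scale_units='xy', scale=1, color='yellow', width=0.002)
    plt.legend()
    plt.show()
```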
Thanks for your reply!
print(radar_map.shape)
It shows [1, 10, 928, 1600]. Does that mean batch size 1, 10 channels, image height 928 and image width 1600? I found that the ResNet-18 input needs to be [3, 224, 224]?
- The radar points can be seen as object candidates, i.e. object locations with initial but inaccurate object center predictions. The task of the model is to refine these initial predictions based on neighboring radar/camera information.
- The input size of convolutional networks is not necessarily fixed. You can run the resnet (https://github.com/longyunf/radiant/blob/cf5355396d42ef17940e29ef8f9e3cabfd8035c3/lib/my_model/resnet.py#L297) with different input sizes, although the relative resolution between input and output may be fixed due to the constant down-sampling (see the sketch after this list).
- DWN uses some raw radar information directly from the radar measurements, such as Doppler velocity.
- DWN compares the object depth estimates from the camera head and the radar head.
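Here is the sketch referred to above: torchvision's ResNet-18 with its first conv widened to 10 channels stands in for the repo's own backbone, purely to show that the convolutional stages accept any input size and only the down-sampling factor (÷32) is fixed:

```python
import torch
from torchvision.models import resnet18

# Standard torchvision ResNet-18, with the first conv swapped for a 10-channel input
# (illustrative only; the repo uses its own resnet.py, not torchvision's).
net = resnet18(weights=None)
net.conv1 = torch.nn.Conv2d(10, 64, kernel_size=7, stride=2, padding=3, bias=False)

def backbone_features(x):
    # Run only the convolutional stages, skipping the avgpool/fc classifier head.
    x = net.maxpool(net.relu(net.bn1(net.conv1(x))))
    x = net.layer1(x); x = net.layer2(x); x = net.layer3(x); x = net.layer4(x)
    return x

for h, w in [(224, 224), (928, 1600)]:
    with torch.no_grad():
        y = backbone_features(torch.zeros(1, 10, h, w))
    print((h, w), '->', tuple(y.shape))   # spatial size is always input / 32
```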
Thank you very much for your answer. I would also like to ask:
- Original radar measurements may offer some information on the confidence of the radar head output, e.g. a higher RCS may indicate a stronger radar signal and higher confidence.
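Putting those answers together (DWN compares the two depth estimates, and raw radar cues such as RCS can hint at confidence), a loose sketch of the weighting idea might look like this; it is not the paper's exact architecture or feature set:

```python
import torch
import torch.nn as nn

class TinyDepthWeighting(nn.Module):
    """Loose sketch of the depth-weighting idea: predict how much to trust the radar
    depth vs. the camera depth from both estimates plus raw radar cues (e.g. RCS,
    Doppler velocity). Not the paper's exact architecture or inputs."""
    def __init__(self, in_dim=4):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, d_cam, d_radar, radar_cues):
        # radar_cues: (N, in_dim - 2) raw radar features such as RCS / Doppler velocity
        x = torch.cat([d_cam[:, None], d_radar[:, None], radar_cues], dim=1)
        w = torch.sigmoid(self.mlp(x)).squeeze(1)   # per-object trust in the radar depth
        return w * d_radar + (1.0 - w) * d_cam      # fused depth estimate
```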
I'm really sorry to bother you with so many questions.
Hello, I would like to ask: according to your code, where can I get the feature-map output of the radar backbone? And if I comment out the freeze calls, does it mean that the camera branch and the radar branch are trained at the same time? Thanks