ChenhongyiYang / PPAL

[CVPR 2024] Plug and Play Active Learning for Object Detection
Apache License 2.0

Issue about Category Conditioned Matching Similarity (CCMS) #6

Closed: kdh-awraw1019 closed this issue 1 year ago

kdh-awraw1019 commented 1 year ago

Hi, I'm trying to connect PPAL to a YOLOv6 model.

Everything works up to the stage just before CCMS, but I'm stuck after that.

RetinaNet, which is used as the reference model in PPAL, has the same number of channels at every feature pyramid level, but YOLOv6 does not.

For a simple example,

RetinaNet's multi-level features (cls_feat) are:

(Pdb) mlvl_feats[0].shape = torch.Size([256, 116, 76])
(Pdb) mlvl_feats[1].shape = torch.Size([256, 58, 38])
(Pdb) mlvl_feats[2].shape = torch.Size([256, 29, 19])
(Pdb) mlvl_feats[3].shape = torch.Size([256, 15, 10])
(Pdb) mlvl_feats[4].shape = torch.Size([256, 8, 5])

But the YOLOv6s multi-level features (cls_feat) are:

torch.Size([1, 64, 56, 80])
torch.Size([1, 128, 28, 40])
torch.Size([1, 256, 14, 20])

So the call 'det_feats = get_inter_feats(mlvl_feats, det_lvl_inds, det_unscale_bboxes, img_shape)' in al_retinanet_feat_head.py does not work.

Your paper also says that KL divergence can be used in the case of SSD, but the 'get_inter_feats' computation is still required before that.

How can I solve it?

ChenhongyiYang commented 1 year ago

Hi,

Thank you for your persistent interest in our work :-)

I think applying PPAL to YOLOv6 can be a little tricky. As you mentioned, in YOLOv6 the head weights are not shared across the FPN levels, so different levels have different numbers of channels. As a result, the features do not live in a unified space, which prevents PPAL from extracting regional features for the CCMS computation. For SSD, which we will open-source later, we circumvent this difficulty with KL divergence: the regional features are simply the classification softmax vectors, so all levels have the same number of channels (the number of classes). However, YOLOv6 uses binary cross-entropy for classification, which also prevents us from using KL divergence.
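To make the softmax versus binary cross-entropy point concrete, here is a minimal pure-Python sketch. It is an illustration only, not the PPAL implementation: the helper names, the example logits, and the choice of the one-directional KL(p || q) are all assumptions. It shows that softmax vectors are valid probability distributions that KL divergence can compare, while per-class sigmoid scores (what a BCE-trained head produces) do not sum to one.

```python
import math

def softmax(logits):
    """Softmax over a list of logits (shifted by the max for numerical stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(logits):
    """Independent per-class sigmoid scores, as produced by a BCE-trained head."""
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length (eps avoids log(0))."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

# Hypothetical classification logits for two detections (3 classes).
logits_a = [2.0, 0.5, -1.0]
logits_b = [0.1, 1.5, 0.3]

p = softmax(logits_a)
q = softmax(logits_b)
kl_pq = kl_divergence(p, q)       # well defined: p and q each sum to 1

sig_sum = sum(sigmoid(logits_a))  # not 1, so this is not a distribution
print("sum of softmax:", sum(p))
print("KL(p || q):", kl_pq)
print("sum of sigmoid scores:", sig_sum)
```

Because the softmax vector always has one entry per class, its length is the same at every pyramid level, regardless of how many channels the underlying feature maps have.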

I am sorry that I cannot give a good answer to your question about applying PPAL to YOLOv6. I think you can make some small modifications to the model to make PPAL usable. For example, you can add a layer that maps each feature map to the same number of channels and use a shared-weight logit layer for classification. For instance, for the 3 levels that have 80, 40 and 20 channels, we can add a simple 1x1 Conv2d to each level so that all of them have 40 channels. Then you can use a shared Conv2d(40, C) layer to compute the final classification logits. In this case, the regional features can be extracted from the unified 40-channel feature maps.
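A minimal PyTorch sketch of this suggestion follows. It is not code from the PPAL or YOLOv6 repositories: the class name `UnifiedClsHead`, the unified width of 128, the 80 classes, and the 1x1 kernel of the shared logit layer are all assumptions made for illustration. The input channel counts (64, 128, 256) are taken from the YOLOv6s shapes posted earlier in this thread.

```python
import torch
import torch.nn as nn

class UnifiedClsHead(nn.Module):
    """Hypothetical head: a per-level 1x1 conv maps every pyramid level to a
    common channel width, then one shared conv produces the class logits."""

    def __init__(self, in_channels, unified_channels, num_classes):
        super().__init__()
        # One 1x1 projection per level (weights NOT shared, input widths differ).
        self.proj = nn.ModuleList(
            [nn.Conv2d(c, unified_channels, kernel_size=1) for c in in_channels]
        )
        # A single logit layer shared by all levels (1x1 kernel is an assumption).
        self.cls_logits = nn.Conv2d(unified_channels, num_classes, kernel_size=1)

    def forward(self, feats):
        # unified[i] has shape (N, unified_channels, H_i, W_i) for every level i.
        unified = [proj(f) for proj, f in zip(self.proj, feats)]
        logits = [self.cls_logits(u) for u in unified]
        return unified, logits

# Dummy inputs with the YOLOv6s cls_feat shapes reported in this thread.
feats = [
    torch.randn(1, 64, 56, 80),
    torch.randn(1, 128, 28, 40),
    torch.randn(1, 256, 14, 20),
]
head = UnifiedClsHead(in_channels=(64, 128, 256), unified_channels=128, num_classes=80)
unified, logits = head(feats)

for u, l in zip(unified, logits):
    print(tuple(u.shape), tuple(l.shape))
```

With a head like this, the `unified` maps would play the role of `mlvl_feats`, since every level then has the same channel count. The added layers start from random weights, so the detector would need to be retrained (or at least fine-tuned) before the features are meaningful.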

Best, Chenhongyi

kdh-awraw1019 commented 1 year ago

Thank you for your advice.

After reading your answer, I realized that I had to modify YOLOv6.

I will consider your advice and look for other ways.

viralrupapara36 commented 10 months ago

Hello @kdh-awraw1019, I'm also applying PPAL to YOLO. I started with YOLOv3 and it is about 90% complete. The only error I'm facing occurs while computing the distances between images (the image_dis.npy file), and it is caused by the shape of cls_feat, as you mentioned in your question. Could you please guide me on this?