tgxs002 / CORA

A DETR-style framework for open-vocabulary detection (OVD). CVPR 2023
Apache License 2.0

Expected date of releasing CLIP-Aligned labeling code #20

Closed zifuwan closed 6 months ago

zifuwan commented 11 months ago

Dear authors,

Thanks for the great work. Do you plan to release the code for CLIP-Aligned Labeling so that we can reproduce the relabeled annotations? I've read your paper but couldn't find the specifics of how the relabeling is done. Do you first relabel the annotations with an untrained region classifier?

Thank you for your time.

theneotopia commented 6 months ago

I implemented it on my own. It seems you can simply run the inference code and, for each ground-truth box, take the class with the largest probability produced by the Region Prompted Region Classifier as its CLIP-Aligned label.

theneotopia commented 6 months ago
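# Assumes relabel_dict = {}, correct_cnt = 0 and total_cnt = 0 are initialized before the evaluation loop,
# and that logits holds the region classifier's per-class scores for this batch's ground-truth boxes.
box_ids = torch.cat([target['box_ids'] for target in targets])
ori_labels = torch.cat([target['labels'] for target in targets])
# CLIP-aligned label: the class with the highest score for each ground-truth box.
relabeled_labels = logits.argmax(dim=-1)
for box_id, ori_label, relabeled_label in zip(box_ids, ori_labels, relabeled_labels):
    # Map contiguous label indices back to dataset category ids before storing.
    relabel_dict[box_id.item()] = {
        'ori_label': data_loader.dataset.label2catid[ori_label.item()],
        'relabeled_label': data_loader.dataset.label2catid[relabeled_label.item()],
    }
# Track how often the relabeled class agrees with the original annotation.
correct_cnt += (ori_labels == relabeled_labels).sum().item()
total_cnt += len(ori_labels)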
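
To actually reproduce the relabeled annotations, the `relabel_dict` built above still has to be written back into the annotation file. Here is a rough sketch of that step, not the authors' code. It assumes the annotations are in COCO-style form and that each `box_id` equals the annotation's `id` field, which the thread does not confirm. The `apply_relabel` helper and the toy data are hypothetical.

```python
import copy


def apply_relabel(coco, relabel_dict):
    """Return a relabeled copy of a COCO-style dict and the number of changed boxes.

    Assumption: keys of relabel_dict are annotation ids (box_ids).
    """
    out = copy.deepcopy(coco)  # leave the original annotations untouched
    changed = 0
    for ann in out["annotations"]:
        entry = relabel_dict.get(ann["id"])
        if entry is None:
            continue  # box was not seen during inference; keep its original label
        if entry["relabeled_label"] != ann["category_id"]:
            ann["category_id"] = entry["relabeled_label"]
            changed += 1
    return out, changed


# Toy data for illustration only.
coco = {
    "annotations": [
        {"id": 1, "image_id": 10, "bbox": [0, 0, 5, 5], "category_id": 3},
        {"id": 2, "image_id": 10, "bbox": [1, 1, 4, 4], "category_id": 7},
        {"id": 3, "image_id": 11, "bbox": [2, 2, 3, 3], "category_id": 7},
    ]
}
relabel_dict = {
    1: {"ori_label": 3, "relabeled_label": 3},
    2: {"ori_label": 7, "relabeled_label": 9},
}

relabeled, changed = apply_relabel(coco, relabel_dict)
print(changed, [a["category_id"] for a in relabeled["annotations"]])
```

The result can then be saved with `json.dump` and used as the training annotation file. Boxes missing from `relabel_dict` keep their original category, so a partial inference run does not silently drop labels.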