wjf5203 / VNext

Next-generation video instance recognition framework on top of Detectron2 which supports InstMove (CVPR 2023), SeqFormer (ECCV Oral), and IDOL (ECCV Oral)
Apache License 2.0
603 stars 53 forks

How to pad the non-bbox? #33

Open CZY-Code opened 2 years ago

CZY-Code commented 2 years ago

Does the number of instances have to be the same in every frame of a video? If not, how should the missing bboxes be padded?

`targets_for_clip_prediction.append({
    "labels": torch.stack(clip_classes,dim=0).max(0)[0],
    "boxes": torch.stack(clip_boxes,dim=1), # [num_inst,num_frame,4]
    'masks': torch.stack(clip_masks,dim=1), # [num_inst,num_frame,H,W]
    'size': torch.as_tensor([h, w], dtype=torch.long, device=self.device),
    'inst_id':inst_ids,
    # 'valid':valid_id
})`
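
For illustration only: `torch.stack` requires all stacked tensors to have the same shape, so stacking boxes into `[num_inst, num_frame, 4]` only works if every instance has a slot in every frame. One common way to get there is to fill the slots of absent instances with an all-zero box and keep a boolean mask marking which slots are real. The sketch below is an assumption about how this could be done, not a confirmed description of what VNext does; the helper name `pad_clip_boxes`, the per-frame dict layout, and the `valid` mask are all hypothetical.

```python
import torch


def pad_clip_boxes(per_frame, inst_ids):
    """Hypothetical helper, not part of VNext.

    per_frame: list with one dict per frame, mapping inst_id -> box tensor of shape [4].
    inst_ids:  every instance id that appears anywhere in the clip.
    Returns boxes [num_inst, num_frame, 4] and valid [num_inst, num_frame].
    """
    num_inst, num_frame = len(inst_ids), len(per_frame)
    # Absent instances keep the all-zero placeholder box.
    boxes = torch.zeros(num_inst, num_frame, 4)
    valid = torch.zeros(num_inst, num_frame, dtype=torch.bool)
    for t, frame in enumerate(per_frame):
        for i, inst_id in enumerate(inst_ids):
            if inst_id in frame:
                boxes[i, t] = frame[inst_id]
                valid[i, t] = True
    return boxes, valid


# Two frames; instance 9 is missing from the second one.
per_frame = [
    {7: torch.tensor([0.5, 0.5, 0.2, 0.2]), 9: torch.tensor([0.3, 0.3, 0.1, 0.1])},
    {7: torch.tensor([0.6, 0.5, 0.2, 0.2])},
]
boxes, valid = pad_clip_boxes(per_frame, inst_ids=[7, 9])
print(boxes.shape)  # torch.Size([2, 2, 4])
print(valid)
```

With a mask like this, a loss could skip the padded slots instead of regressing toward the zero box. Whether the commented-out `'valid'` key in the snippet above was meant for that purpose is a guess.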