IDEA-Research / detrex

detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
https://detrex.readthedocs.io/en/latest/
Apache License 2.0

Two questions about denoising design, which might affect the performance #343

Open DeclK opened 4 months ago

DeclK commented 4 months ago

Hi, I love the detrex project, such great work! I am reading the DINO code, and I have two questions about the denoising design:

  1. When building the denoising queries, zeros are padded in so that the queries can be batched. But the padded zeros appear to take part in the attention computation, because the attention mask does not account for the padding. Since DINO and DN-DETR both perform very well, I wonder how much this design affects the network. Have you tried masking out the padding?
  2. In contrastive denoising, the negative noise can be large. According to the code, there is a chance of producing invalid negative boxes, because w and h could end up less than 0. Is this intended? https://github.com/IDEA-Research/detrex/blob/67f703b4afcdd448918dc019f101cb5a4f737205/projects/dino/modeling/dino.py#L424-L433
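
To make both questions concrete, here is a minimal pure-Python sketch. It is not the detrex implementation. It assumes that noise is added to the box corners in proportion to half the box size, that negatives use a magnitude in [1, 2) while positives use [0, 1), and that the denoising queries are laid out as `2 * num_groups` blocks, each padded to the largest ground-truth count in the batch. The names `apply_corner_noise`, `make_valid` and `padding_slots` are hypothetical helpers written only for this illustration.

```python
def apply_corner_noise(box_cxcywh, rand_part, rand_sign, scale=1.0):
    """Jitter the corners of a normalized (cx, cy, w, h) box.

    rand_part: four magnitudes; assumed [0, 1) for positives, [1, 2) for negatives.
    rand_sign: four values, each -1.0 or +1.0.
    Returns the noised box, again as (cx, cy, w, h).
    """
    cx, cy, w, h = box_cxcywh
    corners = [cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2]  # x1, y1, x2, y2
    half = [w / 2, h / 2, w / 2, h / 2]  # shift per unit of noise
    x1, y1, x2, y2 = [
        min(max(c + p * s * d * scale, 0.0), 1.0)  # clamp to the image
        for c, p, s, d in zip(corners, rand_part, rand_sign, half)
    ]
    # Nothing here stops x2 from landing left of x1, so w or h can go negative.
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def make_valid(box_cxcywh):
    """Hypothetical fix: keep the center, force width and height non-negative."""
    cx, cy, w, h = box_cxcywh
    return (cx, cy, abs(w), abs(h))


def padding_slots(num_gt_per_image, num_groups):
    """Mark which denoising slots are zero padding for each image.

    Assumed layout: 2 * num_groups blocks (one positive and one negative block
    per group), each padded to the largest ground-truth count in the batch.
    True means the slot holds padding rather than a real noised box.
    """
    block = max(num_gt_per_image)
    return [
        [i >= n for i in range(block)] * (2 * num_groups)
        for n in num_gt_per_image
    ]


box = (0.5, 0.5, 0.2, 0.2)
inward = [1.0, 1.0, -1.0, -1.0]  # push every corner toward the center

positive = apply_corner_noise(box, [0.99] * 4, inward)
negative = apply_corner_noise(box, [1.5] * 4, inward)
print("positive w, h:", positive[2], positive[3])  # stays above zero
print("negative w, h:", negative[2], negative[3])  # drops below zero
print("repaired negative:", make_valid(negative))
print("padding slots:", padding_slots([2, 1], num_groups=1))
```

Under these assumptions a positive shift is always smaller than half the box size, so the corners cannot cross, while a negative shift of 1.5 half-sizes from both sides flips them. The per-image flags from `padding_slots` are the kind of information a key padding mask would need if the padded queries were to be excluded from attention. Whether either change helps accuracy is exactly the open question here.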