
[DynamicHead - Microsoft] Implementation request #30369

Open lsch0lz opened 4 months ago

lsch0lz commented 4 months ago

Model description

Hi there! Has anyone tried to implement Microsoft's DyHead? If not, I would like to contribute an implementation by adding a new model to the library. Would there be interest in a PR for this model?

Here's a short explanation from their official implementation: "In this paper, we present a novel dynamic head framework to unify object detection heads with attentions. By coherently combining multiple self-attention mechanisms between feature levels for scale-awareness, among spatial locations for spatial-awareness, and within output channels for task-awareness, the proposed approach significantly improves the representation ability of object detection heads without any computational overhead."
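
To make the three attentions concrete, here's a minimal PyTorch sketch of a single DyHead block. This is a simplified illustration, not the official code: the class and module names are made up, and where the paper uses deformable convolution for the spatial attention, a plain 3x3 convolution is substituted to keep the sketch self-contained.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class DyHeadBlockSketch(nn.Module):
    """Simplified DyHead block over stacked pyramid features (B, L, C, H, W)."""

    def __init__(self, channels: int):
        super().__init__()
        # pi_L, scale-aware attention: one scalar gate per level from pooled features.
        self.scale_attn = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.ReLU(inplace=True),
        )
        # pi_S, spatial-aware attention: the paper uses deformable convolution here;
        # a plain 3x3 convolution stands in for it in this sketch.
        self.spatial_conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # pi_C, task-aware attention: per-channel dynamic activation (DyReLU-style).
        self.task_fc = nn.Sequential(
            nn.Linear(channels, channels // 4),
            nn.ReLU(inplace=True),
            nn.Linear(channels // 4, 2 * channels),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, l, c, h, w = x.shape
        flat = x.reshape(b * l, c, h, w)

        # Scale-aware: gate each level with a hard-sigmoid scalar.
        out = flat * F.hardsigmoid(self.scale_attn(flat))

        # Spatial-aware: mix information across spatial locations.
        out = self.spatial_conv(out)

        # Task-aware: predict per-channel (slope, bias) from pooled features
        # and apply a piecewise-linear gate, max(x, a * x + b).
        a, bias = self.task_fc(out.mean(dim=(2, 3))).chunk(2, dim=1)
        out = torch.maximum(out, a[..., None, None] * out + bias[..., None, None])
        return out.reshape(b, l, c, h, w)


# Quick shape check: 5 pyramid levels resized to a common 32x32 resolution.
block = DyHeadBlockSketch(channels=256)
feats = torch.randn(2, 5, 256, 32, 32)
print(block(feats).shape)  # torch.Size([2, 5, 256, 32, 32])
```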

Open source status

Provide useful links for the implementation

GitHub Repository: https://github.com/microsoft/DynamicHead
Paper: https://arxiv.org/abs/2106.08322

NielsRogge commented 4 months ago

That would be really cool! However, I see they integrated DyHead only into frameworks such as Faster R-CNN and ATSS. Would that mean you would also need to implement one of those frameworks?
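
For context on why that matters: in the official repo, DyHead is not a standalone detector but a stack of attention blocks inserted between the feature pyramid and the box head of an existing detector. Roughly like this (every class name below is hypothetical):

```python
import torch.nn as nn


class DetectorWithDyHead(nn.Module):
    """Hypothetical wiring: DyHead sits between the FPN and the box head."""

    def __init__(self, backbone, fpn, dyhead_blocks, box_head):
        super().__init__()
        self.backbone = backbone                     # e.g. a ResNet
        self.fpn = fpn                               # feature pyramid network
        self.dyhead = nn.Sequential(*dyhead_blocks)  # stacked DyHead blocks
        self.box_head = box_head                     # e.g. an ATSS or Faster R-CNN head

    def forward(self, images):
        features = self.fpn(self.backbone(images))   # multi-scale features
        features = self.dyhead(features)             # refine with attention
        return self.box_head(features)               # class logits + box regression
```

So contributing DyHead alone wouldn't be usable end to end: it needs a surrounding detector, and neither ATSS nor Faster R-CNN currently exists in the library.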

lsch0lz commented 4 months ago

Oh shoot! I hadn't checked whether all the parts I needed were already in the transformers library; I just wanted to add a new model. Is there some sort of list of requested models whose components are already available?

NielsRogge commented 4 months ago

Currently the only object detection model being added is RT-DETR at #29077. I think it'd be great to support more recent object detection models that top the rankings at https://paperswithcode.com/sota/object-detection-on-coco (although those are usually more accurate but slower). cc also @qubvel
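
For reference, whatever detector gets contributed would plug into the library's existing object detection API. Here's what that surface looks like today with DETR, which is already supported (the checkpoint name below is real; a contributed model would follow the same pattern):

```python
import torch
import requests
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForObjectDetection

# Standard COCO sample image used throughout the transformers docs.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

processor = AutoImageProcessor.from_pretrained("facebook/detr-resnet-50")
model = AutoModelForObjectDetection.from_pretrained("facebook/detr-resnet-50")

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Convert raw logits/boxes into thresholded, image-sized detections.
target_sizes = torch.tensor([image.size[::-1]])  # (height, width)
results = processor.post_process_object_detection(
    outputs, threshold=0.9, target_sizes=target_sizes
)[0]
for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
    print(model.config.id2label[label.item()], round(score.item(), 3), box.tolist())
```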