Open mali-afridi opened 3 months ago
Hey, I was wondering how we could support pose estimation with RT-DETR. The HybridEncoder outputs 256-dimensional features. Which heads should we use for pose estimation? Would a heatmap-based approach work?