Closed: gottang closed this issue 1 week ago
Location in document: S3.SS1.p1.2
Selected HTML: learned embeddings called object queries to generate object embeddings, which are then processed by feed-forward prediction networks (FFPNs) to generate class probabilities, 2D bounding box, and IBB keypoints in parallel. The IBB keypoints are then processed by a subsequent FFPN to estimate the translation and rotation parameters. Since the cardinality of the predicted set is fixed, the model is trained to predict Ø classes after detecting all the target objects present in the image. By associating predictions and ground truth objects using a bipartite matching algorithm [46], YOLOPose is trained end-to-end.
Hello @gottang, thanks for the issue report! We are reviewing your report and will address it as soon as possible.
Description
Unable to translate
(Optional:) Please add any files, screenshots, or other information here.
No response
(Required) What is this issue most closely related to? Select one.
Choose One
Internal issue ID
69cd0365-c3d7-4d43-894d-fcbce5a88341
Paper URL
https://arxiv.org/html/2403.09309?_immersive_translate_auto_translate=1
Browser
Chrome/99.0.4844.84
Device Type
Desktop