nutonomy / nuscenes-devkit

The devkit of the nuScenes dataset.
https://www.nuScenes.org

Matching YoLo detections to Nuscenes ground truth detection boxes #677

Closed abadithela closed 2 years ago

abadithela commented 2 years ago

Hi,

I don't actively work in computer vision, so the following question might be a trivial one. I am trying to apply YoLo for object detection to the nuscenes dataset on the vision data only. YoLo returns pixel coordinates of the objects detected in the image. How do I convert this pixel output into the camera coordinate frame consistent with nuscenes? After that, I need to go from the camera coordinate frame to the global coordinate frame, and I was hoping to modify box_to_sensor to do that. https://github.com/nutonomy/nuscenes-devkit/blob/864d0a207539e5383cd3eb26ebb1d7a44622f09d/python-sdk/nuscenes/eval/common/utils.py#L130

holger-motional commented 2 years ago

Hi. If your goal is to go from Yolo's 2d boxes to nuScenes 3d boxes, that is strictly speaking not possible. For each 2d box there are infinitely many possible 3d boxes. That said, there are all kinds of tricks you could use.
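To illustrate the ambiguity: under a pinhole camera model, every point along the ray through a pixel projects back to that same pixel, so a 2d detection alone does not fix the depth. The sketch below is a minimal illustration, assuming a camera frame with z pointing forward; the intrinsics and the pixel are made-up values, not taken from any nuScenes calibration.

```python
def project(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a 3D point (camera frame, z forward) to pixels."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must be in front of the camera")
    return (fx * x / z + cx, fy * y / z + cy)


def back_project(pixel, depth, fx, fy, cx, cy):
    """Return the 3D point on the pixel's viewing ray at the given depth."""
    u, v = pixel
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


# Hypothetical intrinsics and pixel, for illustration only.
FX, FY, CX, CY = 1266.0, 1266.0, 816.0, 491.0
pixel = (1000.0, 600.0)

# Three very different 3D points (5 m, 20 m and 80 m away) ...
candidates = [back_project(pixel, d, FX, FY, CX, CY) for d in (5.0, 20.0, 80.0)]
# ... that all land on the same pixel when projected.
reprojected = [project(p, FX, FY, CX, CY) for p in candidates]

for point, uv in zip(candidates, reprojected):
    print(point, "->", uv)
```

Recovering a unique 3D position therefore needs extra information, such as a depth estimate or an assumed object size.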

abadithela commented 2 years ago

Hi Holger:

I'd like to use the second option.

Aside from that, I'm using a pre-trained YoLo model (not trained on the nuscenes dataset) to detect cars and pedestrians. YoLo returns 2d bounding boxes in pixels, and I'm trying to match those with the ground truth 2d bounding boxes from nuscenes, also in pixels. Basically, I'm trying to compute the precision and recall (and other classification metrics) of the YoLo algorithm on the nuscenes dataset. I managed to get both sets of boxes in pixel coordinates at the same image size; however, the ground truth and predicted boxes obviously do not overlap perfectly. Further, it seems that some nuscenes ground truth bounding boxes miss the object.

Thanks, Apurva
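Since predicted and ground truth boxes never overlap perfectly, a common way to match them is by intersection over union (IoU) with a threshold. The following is a minimal sketch of greedy IoU matching, not something taken from the devkit: the function names, the 0.5 threshold and the example boxes are assumptions, and in practice the matching would be run per image and per class.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) in pixels."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def match_detections(preds, gts, iou_thresh=0.5):
    """Greedily match predictions to ground truth boxes.

    preds: list of (box, score); gts: list of boxes.
    Returns (true positives, false positives, false negatives).
    """
    unmatched = set(range(len(gts)))
    tp = fp = 0
    # Most confident predictions get first pick of the ground truth boxes.
    for box, _score in sorted(preds, key=lambda p: p[1], reverse=True):
        best_j, best_iou = None, iou_thresh
        for j in sorted(unmatched):
            overlap = iou(box, gts[j])
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j is None:
            fp += 1
        else:
            tp += 1
            unmatched.remove(best_j)  # each ground truth box is used once
    return tp, fp, len(unmatched)


# Made-up boxes for one image and one class.
gts = [(100, 100, 200, 200), (300, 300, 400, 400)]
preds = [((110, 110, 210, 210), 0.9), ((500, 500, 550, 550), 0.8)]

tp, fp, fn = match_detections(preds, gts)
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
print(tp, fp, fn, precision, recall)
```

Lowering `iou_thresh` makes the matching more tolerant of loosely fitting ground truth boxes, at the cost of accepting poorer localizations as true positives.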




holger-motional commented 2 years ago

I don't think it makes sense for you to lift 2d to 3d. I suggest the following:

abadithela commented 2 years ago

Got it, thanks. I ended up using functions from export_2d_annotations_as_json.py.
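For readers wondering how a 2d ground truth box can be obtained from a 3d annotation, the general idea is to project the eight corners of the 3d box into the image and take their pixel extent. The sketch below is a simplified, self-contained illustration of that idea and not the code in export_2d_annotations_as_json.py: the function name, the intrinsics and the example box are assumptions, the corners are assumed to be already in the camera frame (z forward), and the box is simply clamped to a 1600x900 image.

```python
def project_box_to_2d(corners_cam, fx, fy, cx, cy, im_w=1600, im_h=900):
    """Project 3D box corners (camera frame) to a 2D box (x1, y1, x2, y2).

    Corners behind the camera are dropped, which is only an approximation
    for boxes that straddle the image plane. Returns None if nothing of
    the box is visible.
    """
    pts = [(fx * x / z + cx, fy * y / z + cy)
           for x, y, z in corners_cam if z > 0]
    if not pts:
        return None
    us = [p[0] for p in pts]
    vs = [p[1] for p in pts]
    # Clamp the pixel extent to the image canvas.
    x1, x2 = max(min(us), 0.0), min(max(us), float(im_w))
    y1, y2 = max(min(vs), 0.0), min(max(vs), float(im_h))
    if x1 >= x2 or y1 >= y2:
        return None
    return (x1, y1, x2, y2)


# Made-up box: 2 m wide, 1.5 m tall, 4 m long, 9 m to 13 m ahead.
corners = [(x, y, z)
           for x in (-1.0, 1.0)
           for y in (-0.75, 0.75)
           for z in (9.0, 13.0)]

box_2d = project_box_to_2d(corners, fx=1000.0, fy=1000.0, cx=800.0, cy=450.0)
print(box_2d)
```

Boxes produced this way tend to be looser than the visible object, because they enclose the whole projected cuboid, including occluded parts.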