princeton-computational-imaging / SeeingThroughFog

MIT License

Gated to RGB homography #38

Closed. connorlee77 closed this 3 years ago

connorlee77 commented 3 years ago

In the supplementary material, it says that a homography map between gated and RGB was calculated to warp gated into the RGB plane. Is this transformation matrix provided anywhere?

It also says that the RGB image was cropped to the gated FOV. Was there any resizing done prior to cropping, or was it just a simple crop?

MarioBijelic commented 3 years ago

Hej @connorlee77,

I added a tool projecting the gated images into the RGB camera frame. Looking forward to your feedback.

connorlee77 commented 3 years ago

@MarioBijelic Sweet, thanks for this update! It helps a lot! I just want to warp the gated images to RGB using a constant homography, so I would go through either process_image or process_images. I have three questions:

  1. In these functions, it appears that you guys crop the RGB image to (202:970, 280:1560) and resize it to 1280x768. Is there a reason why the gated image is padded with 30 pixels on the right?

  2. Can you explain how the keypoints in corresponding_points.txt were marked? I saw that there's an offset of 768, which to me indicates that the sample RGB and gated image were aligned vertically. Were both images of size 1280x768 during this manual feature matching step?

  3. I'm also looking to transform the gated labels in gated_labels_TMP to the RGB coordinate frame. My instinct is to transform the gated bbox coordinates to RGB using the homography, and then force the result to be rectangular with a combination of min() and max(). Is that what you guys did? I'm also wondering whether the labels in cam_left_labels_TMP and gated_labels_TMP are duplicates of each other, just in their respective coordinate frames (with visibility indicated by the last 3 boolean values), or whether some bboxes exist in gated_labels_TMP that don't exist in cam_left_labels_TMP.
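As a quick check on the numbers in question 1: if (202:970, 280:1560) is read as NumPy-style row and column slices (an assumption about how the tool indexes images), the crop is already 768 rows by 1280 columns, so a subsequent resize to 1280x768 would not change the scale. The sketch below only does that arithmetic; the helper names are hypothetical and not part of the repository.

```python
# Crop window quoted in question 1, read as [rows, cols] slices.
# Assumption: the first pair indexes rows (y), the second columns (x).
ROW_START, ROW_END = 202, 970
COL_START, COL_END = 280, 1560

crop_height = ROW_END - ROW_START  # number of rows kept
crop_width = COL_END - COL_START   # number of columns kept


def full_to_crop(x, y):
    """Map a pixel (x, y) in the full RGB frame into the cropped frame."""
    return x - COL_START, y - ROW_START


def crop_to_full(x, y):
    """Map a pixel (x, y) in the cropped frame back into the full RGB frame."""
    return x + COL_START, y + ROW_START


print(crop_width, crop_height)  # 1280 768
```

Under this reading, coordinates in the cropped frame differ from full-frame RGB coordinates only by a constant shift of (280, 202).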

MarioBijelic commented 3 years ago

Dear @connorlee77,

  1. No, that's an old artefact which survived refactoring.
  2. Both images were cropped to a similar FOV and placed next to each other, therefore the second image has an offset of 768.
  3. Yes, that is something we did as well for objects without 3D boxes. For closer objects, we used the available 3D bounding boxes. More distant objects that were not visible in other cameras were additionally labeled by a human annotator. In our latest work (https://arxiv.org/abs/2102.03602) we refined the 2D bounding boxes for gated_labels_TMP by reannotating each object individually.
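The two ideas in answers 2 and 3 can be sketched in plain Python. This is a minimal illustration, not the repository's code: the function names are hypothetical, the matrix H is a placeholder rather than the real gated-to-RGB homography, and the thread does not settle which axis the 768-pixel offset applies to, so the axis is left as a parameter.

```python
def remove_offset(x, y, offset=768, axis="y"):
    """Undo the offset of a point marked on the second image of a joined pair.

    Assumption: the second image was shifted by `offset` pixels along `axis`.
    """
    if axis == "y":
        return x, y - offset
    return x - offset, y


def apply_homography(H, x, y):
    """Apply a 3x3 homography (nested lists) to one point, with perspective divide."""
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    u = (H[0][0] * x + H[0][1] * y + H[0][2]) / w
    v = (H[1][0] * x + H[1][1] * y + H[1][2]) / w
    return u, v


def warp_bbox(H, xmin, ymin, xmax, ymax):
    """Warp all four corners, then take min/max to get an axis-aligned box."""
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    warped = [apply_homography(H, x, y) for x, y in corners]
    xs = [p[0] for p in warped]
    ys = [p[1] for p in warped]
    return min(xs), min(ys), max(xs), max(ys)


# Placeholder homography (a pure translation), for illustration only.
H = [[1.0, 0.0, 10.0],
     [0.0, 1.0, -5.0],
     [0.0, 0.0, 1.0]]

print(warp_bbox(H, 100, 200, 300, 400))  # (110.0, 195.0, 310.0, 395.0)
```

Warping all four corners rather than only two matters because a general homography does not keep the box axis-aligned; the min/max step then yields the tightest enclosing rectangle, which can be slightly larger than the object.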