city-super / MatrixCity

Apache License 2.0

Some questions about street dataset: cubemap orders of street images in small_city, differences of "transforms.json" and "transforms_origin.json" and other questions #7

Closed thucz closed 11 months ago

thucz commented 11 months ago

Sorry to bother you. MatrixCity is a great dataset with high quality! I have some questions.

  1. In your paper, you said:

    We position six perspective cameras at each capturing location to render a cube map, providing a comprehensive view of the surroundings. Note that the cube map can be naturally transformed into panorama images, which are suitable for capturing the street views as much as possible with limited camera positions.

But I cannot tell the order of the cube-map faces captured at the same position. Could you tell me which face is FRONT (or BACK, TOP, DOWN, LEFT, RIGHT)?

  2. What's the difference between "transforms.json" and "transforms_origin.json"? Is the "rot_mat/transform_matrix" in the JSON file a camera-to-world matrix?
yixuanli98 commented 11 months ago
  1. This cannot be determined reliably. For each street, we manually annotate the start and end positions and sample uniformly between them. Our plugin captures one street facing a single direction at a time, then changes the direction and captures the street again, repeating until all directions are covered. Because different streets have different lengths, we cannot simply tell you the direction of every photo. We will release the start and end positions of each street soon. You can infer the TOP and DOWN faces. However, we do not strictly start the camera direction from "FRONT"; sometimes it starts from "BACK". So we cannot accurately provide the FRONT, BACK, LEFT, and RIGHT information.
yixuanli98 commented 11 months ago
  1. "transforms_origin.json" contains the original, complete set of collected trajectories. For aerial data, many redundant views look outside the map and contain no useful information, so we deleted them. For street data, the DOWN view contains only the street surface, and such views are also discarded in real-world capture pipelines like nerfstudio, so we deleted the DOWN views. "transforms.json" is the data used in our training and evaluation. The split tested in our paper is in the directory street/pose.
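To check which views were dropped, the two files can be compared frame by frame. The following is only a rough sketch: it assumes both files hold a top-level "frames" list and that each frame carries an identifier under a key called "frame_index". Neither name is confirmed in this thread, so adjust them to the actual layout. The function name `removed_frames` is made up for illustration.

```python
from typing import Any, Dict, List


def removed_frames(
    origin: Dict[str, Any],
    filtered: Dict[str, Any],
    key: str = "frame_index",  # assumed identifier field, not confirmed by the thread
) -> List[Any]:
    """List identifiers present in `origin` but missing from `filtered`.

    Both arguments are parsed JSON dictionaries that are assumed to
    contain a "frames" list (an assumption about the file layout).
    """
    kept = {frame[key] for frame in filtered["frames"]}
    return [frame[key] for frame in origin["frames"] if frame[key] not in kept]


# Toy data standing in for transforms_origin.json and transforms.json.
origin_example = {"frames": [{"frame_index": i} for i in range(6)]}
filtered_example = {"frames": [{"frame_index": i} for i in (0, 1, 2, 4)]}
print(removed_frames(origin_example, filtered_example))
```

With real data, load each file with `json.load` and pass the resulting dictionaries to the function.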
yixuanli98 commented 11 months ago
  1. "rot_mat/transform_matrix" in the JSON file is a camera-to-world matrix, but it needs some modification before it can be used by NeRF. The 3×3 matrix should be multiplied by 100, and the 3×1 position vector should be divided by 100. The poses in street/pose have already been modified and can be used directly.
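The conversion described above could be sketched roughly as follows. This assumes the matrix is stored as a 4×4 nested list with the 3×3 block in the upper left and the position in the last column. That layout, the function name `to_nerf_pose`, and the sample values are assumptions for illustration, not taken from the dataset.

```python
from typing import List

Matrix = List[List[float]]


def to_nerf_pose(c2w: Matrix, scale: float = 100.0) -> Matrix:
    """Return a modified copy of a 4x4 camera-to-world matrix.

    The upper-left 3x3 block is multiplied by `scale` and the 3x1
    position vector (last column, first three rows) is divided by it.
    The bottom row and the input matrix are left untouched.
    """
    if len(c2w) != 4 or any(len(row) != 4 for row in c2w):
        raise ValueError("expected a 4x4 matrix")  # assumed layout
    out = [[float(v) for v in row] for row in c2w]
    for r in range(3):
        for c in range(3):
            out[r][c] *= scale
        out[r][3] /= scale
    return out


# Made-up values, only to show the shape of the input and output.
raw = [
    [0.01, 0.0, 0.0, 250.0],
    [0.0, 0.01, 0.0, -100.0],
    [0.0, 0.0, 0.01, 50.0],
    [0.0, 0.0, 0.0, 1.0],
]
for row in to_nerf_pose(raw):
    print(row)
```

This step should only be needed for the raw "rot_mat/transform_matrix" values, since the poses in street/pose are stated to be converted already.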
thucz commented 11 months ago

Thank you sincerely for your reply.