city-super / MatrixCity

Apache License 2.0
axis and directions of transformations? #21

Closed AIBluefisher closed 9 months ago

AIBluefisher commented 9 months ago

Hi,

May I know the axis convention (e.g. down-right-backwards) and the direction (camera-to-world or world-to-camera) of the camera poses?

yixuanli98 commented 9 months ago

The camera poses are camera-to-world. We use the same pose coordinate system as the original NeRF repo: the local camera coordinate system of an image is defined such that the X axis points to the right, the Y axis points upwards, and the Z axis points backwards, as seen from the image.
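For readers who need these poses in an OpenCV/COLMAP-style convention (world-to-camera, with X right, Y down, Z forwards), the description above can be sketched as follows. This is a minimal illustration and not code from the MatrixCity repository: the function names are hypothetical, and it assumes each pose is a 4x4 rigid transform whose 3x3 rotation block is orthonormal (no scale baked in).

```python
# Sketch: NeRF/OpenGL-style camera-to-world (X right, Y up, Z backwards)
# -> OpenCV/COLMAP-style world-to-camera (X right, Y down, Z forwards).
# Pure Python, no dependencies. All names here are illustrative.

# Flips the camera's local Y and Z axes.
FLIP_YZ = [
    [1, 0, 0, 0],
    [0, -1, 0, 0],
    [0, 0, -1, 0],
    [0, 0, 0, 1],
]


def matmul4(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def invert_rigid(m):
    """Invert a rigid transform [R | t] as [R^T | -R^T t].

    Assumes R is orthonormal; a pose with scale needs a general inverse.
    """
    r_t = [[m[j][i] for j in range(3)] for i in range(3)]
    t = [m[i][3] for i in range(3)]
    new_t = [-sum(r_t[i][k] * t[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [new_t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def nerf_c2w_to_opencv_w2c(c2w):
    """Convert one NeRF-style camera-to-world pose to OpenCV-style world-to-camera."""
    # Right-multiplying changes the camera's local axes, not the world frame.
    c2w_cv = matmul4(c2w, FLIP_YZ)
    return invert_rigid(c2w_cv)
```

A quick sanity check: a NeRF-style camera looks along its local -Z axis, so after conversion a world point directly in front of the camera should have a positive Z coordinate in the camera frame.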

AIBluefisher commented 9 months ago

Thanks for your explanation. It would be better to make this clear in the README.

AIBluefisher commented 9 months ago

I also have a question: for the aerial_street_fusion dataset, the camera poses of the aerial images do not seem to share the same coordinate frame as the camera poses of the street images:

[image: aerial_steet_fusion]

I visualized the camera poses using COLMAP's GUI. As we can observe, the camera poses of the street images are higher than those of the aerial images. There must be a transformation to align the camera poses. Or am I missing something?

yixuanli98 commented 9 months ago

> Thanks for your explanation. It would be better to make this clear in the README.

We describe the coordinate system here: https://github.com/city-super/MatrixCity?tab=readme-ov-file#data-download.

yixuanli98 commented 9 months ago

> I also have a question: for the aerial_street_fusion dataset, the camera poses of the aerial images do not seem to share the same coordinate frame as the camera poses of the street images: [image: aerial_steet_fusion] I visualized the camera poses using COLMAP's GUI. As we can observe, the camera poses of the street images are higher than those of the aerial images. There must be a transformation to align the camera poses. Or am I missing something?

They are in the same coordinate system. Our coordinate system is not the COLMAP system, so I don't think it is reasonable to visualize these camera poses directly in the COLMAP GUI. If you look at transforms_train.json, the height of the street cameras is 0.03 and the height of the aerial cameras is 2. The unit is 100 m, so these correspond to about 3 m and 200 m. You can read the explanation of our data structure here: https://github.com/city-super/MatrixCity?tab=readme-ov-file#data-structure.
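The height check described above can be sketched in a few lines. This is only an illustration under stated assumptions: it assumes transforms_train.json follows the NeRF-style layout with a `frames` list whose entries carry a 4x4 `transform_matrix`, and that the world up axis is Z (row index 2 of the translation column). Neither is confirmed in this thread, so check both against the README before relying on it. The function name is hypothetical.

```python
import json

# From the thread: one pose unit corresponds to 100 m.
UNIT_IN_METERS = 100.0


def camera_heights_m(transforms, up_axis=2):
    """Return camera heights in metres from a parsed transforms dict.

    Assumes NeRF-style keys ("frames", "transform_matrix") and that
    `up_axis` indexes the world up direction; both are assumptions.
    """
    heights = []
    for frame in transforms["frames"]:
        c2w = frame["transform_matrix"]
        # The last column of a camera-to-world matrix is the camera position.
        heights.append(c2w[up_axis][3] * UNIT_IN_METERS)
    return heights


def load_transforms(path):
    """Load a transforms JSON file such as transforms_train.json."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)
```

For example, `camera_heights_m(load_transforms("transforms_train.json"))` should give values near 3 for street frames and near 200 for aerial frames if the assumptions above hold.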

yixuanli98 commented 9 months ago

> > Thanks for your explanation. It would be better to make this clear in the README.
>
> We describe the coordinate system here: https://github.com/city-super/MatrixCity?tab=readme-ov-file#data-download.

Thanks for your suggestion. We will add a new section explaining this so that it is easier to find.

yixuanli98 commented 9 months ago

> > > Thanks for your explanation. It would be better to make this clear in the README.
> >
> > We describe the coordinate system here: https://github.com/city-super/MatrixCity?tab=readme-ov-file#data-download.
>
> Thanks for your suggestion. We will add a new section explaining this so that it is easier to find.

We have updated the information here: https://github.com/city-super/MatrixCity?tab=readme-ov-file#pose-file-structure.

AIBluefisher commented 9 months ago

> > > > Thanks for your explanation. It would be better to make this clear in the README.
> > >
> > > We describe the coordinate system here: https://github.com/city-super/MatrixCity?tab=readme-ov-file#data-download.
> >
> > Thanks for your suggestion. We will add a new section explaining this so that it is easier to find.
>
> We have updated the information here: https://github.com/city-super/MatrixCity?tab=readme-ov-file#pose-file-structure.

Thanks for your help and the quick update!

AIBluefisher commented 9 months ago

> > I also have a question: for the aerial_street_fusion dataset, the camera poses of the aerial images do not seem to share the same coordinate frame as the camera poses of the street images: [image: aerial_steet_fusion] I visualized the camera poses using COLMAP's GUI. As we can observe, the camera poses of the street images are higher than those of the aerial images. There must be a transformation to align the camera poses. Or am I missing something?
>
> They are in the same coordinate system. Our coordinate system is not the COLMAP system, so I don't think it is reasonable to visualize these camera poses directly in the COLMAP GUI. If you look at transforms_train.json, the height of the street cameras is 0.03 and the height of the aerial cameras is 2. The unit is 100 m. You can read the explanation of our data structure here: https://github.com/city-super/MatrixCity?tab=readme-ov-file#data-structure.

Thanks for the confirmation. I converted the poses to COLMAP's format, so there is probably an issue in my conversion code. I will try to fix it later.
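For anyone attempting the same conversion, one step that is easy to get wrong is writing the pose in COLMAP's text format. COLMAP's images.txt stores a world-to-camera pose per image as `IMAGE_ID QW QX QY QZ TX TY TZ CAMERA_ID NAME`, with the quaternion in w, x, y, z order. The sketch below assumes the pose has already been converted to an OpenCV-style world-to-camera 4x4 matrix with an orthonormal rotation; the function names are illustrative and this is not the conversion code discussed in the thread.

```python
import math


def rotmat_to_quat_wxyz(r):
    """Convert a 3x3 rotation matrix (nested lists) to a unit quaternion (w, x, y, z)."""
    trace = r[0][0] + r[1][1] + r[2][2]
    if trace > 0.0:
        s = 2.0 * math.sqrt(trace + 1.0)
        w = 0.25 * s
        x = (r[2][1] - r[1][2]) / s
        y = (r[0][2] - r[2][0]) / s
        z = (r[1][0] - r[0][1]) / s
    elif r[0][0] > r[1][1] and r[0][0] > r[2][2]:
        s = 2.0 * math.sqrt(1.0 + r[0][0] - r[1][1] - r[2][2])
        w = (r[2][1] - r[1][2]) / s
        x = 0.25 * s
        y = (r[0][1] + r[1][0]) / s
        z = (r[0][2] + r[2][0]) / s
    elif r[1][1] > r[2][2]:
        s = 2.0 * math.sqrt(1.0 + r[1][1] - r[0][0] - r[2][2])
        w = (r[0][2] - r[2][0]) / s
        x = (r[0][1] + r[1][0]) / s
        y = 0.25 * s
        z = (r[1][2] + r[2][1]) / s
    else:
        s = 2.0 * math.sqrt(1.0 + r[2][2] - r[0][0] - r[1][1])
        w = (r[1][0] - r[0][1]) / s
        x = (r[0][2] + r[2][0]) / s
        y = (r[1][2] + r[2][1]) / s
        z = 0.25 * s
    return (w, x, y, z)


def colmap_image_line(image_id, w2c, camera_id, name):
    """Format the pose line of one images.txt entry from a 4x4 world-to-camera matrix."""
    rot = [row[:3] for row in w2c[:3]]
    qw, qx, qy, qz = rotmat_to_quat_wxyz(rot)
    tx, ty, tz = w2c[0][3], w2c[1][3], w2c[2][3]
    fields = [image_id, qw, qx, qy, qz, tx, ty, tz, camera_id, name]
    return " ".join(str(v) for v in fields)
```

In images.txt each pose line is followed by a second line listing 2D points, which may be left empty when only poses are known. Keep in mind that the translations stay in the dataset's units (100 m per unit, according to the thread) unless they are rescaled.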