neu-vi / PlanarRecon

Apache License 2.0

Vertical recording on a mobile phone gives very bad results #11

Open wangshuailpp opened 1 year ago

wangshuailpp commented 1 year ago

Hi, when I record a vertical iPhone dataset with ios_logger, I get a very bad result. Does PlanarRecon have any pose or rotation requirements? Below are the vertical recording image and result (images: 00000, WXWorkCapture_16660029427769). When I record a horizontal iPhone dataset, the result is quite good. Below are the horizontal image and result (images: 00000, WXWorkCapture_16660032056825). Thanks!

ymingxie commented 1 year ago

Hi, you can use vertical images, but do not rotate them by 90 degrees as shown in the figure in your post.

wangshuailpp commented 1 year ago

Thanks! My English explanation was a bit unclear, so let me describe the problem more precisely. I did not rotate the images myself. I held the iPhone vertically while recording, as shown below (image: d4320e2d-4763-45d8-a18a-bd6ec5b3f4a5); the recorded images come out automatically rotated by 90 degrees into landscape orientation, as in the first image of my original post. Call this case 1. If I instead hold the iPhone horizontally, as shown below (image: f58b9970-614d-4683-9e32-2d0fd9676d9d), the recorded images are normal portrait images, as in the second image of my post. Call this case 2. Overall there are two observations: (1) PlanarRecon cannot handle case 1 and produces poor results, but handles case 2 fine. (2) I tried NeuralRecon, and it handles both cases, so I suspect this is related to the added plane constraints and the dropped TSDF constraint, or something else that needs special handling. Thanks for the reply!

ymingxie commented 1 year ago

Hi, it may be that the 2D feature extraction was trained only on horizontal images, so vertical images work less well. But NeuralRecon should also be affected by this to some degree, so I'm not sure. Still, you can try the following: if the images are automatically rotated by 90 degrees, rotate them by -90 degrees before they enter the 2D backbone, then rotate the resulting feature maps back by 90 degrees: https://github.com/neu-vi/PlanarRecon/blob/main/models/planarrecon.py#L83. This way, the intrinsics/extrinsics used for back projection do not need to change.
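The rotate-then-rotate-back idea above could be sketched as follows. This is a hedged illustration, not the actual PlanarRecon code; `backbone2d` and `extract_features_rotated` are illustrative names standing in for the real 2D feature extractor at the linked line.

```python
# Sketch: undo the automatic +90-degree image rotation before the 2D
# backbone, then rotate the feature maps back, so the intrinsics and
# extrinsics used for back projection can stay unchanged.
# `backbone2d` is a placeholder for the real feature extractor.
import torch
import torch.nn as nn


def extract_features_rotated(backbone2d: nn.Module, imgs: torch.Tensor) -> torch.Tensor:
    """imgs: (B, C, H, W) images that were auto-rotated by +90 degrees."""
    # Rotate -90 degrees over the spatial dims so the images look like
    # the horizontal images the backbone was trained on.
    imgs_rot = torch.rot90(imgs, k=-1, dims=(2, 3))
    feats = backbone2d(imgs_rot)
    # Rotate the feature map +90 degrees back to the original
    # orientation, so downstream back projection is untouched.
    return torch.rot90(feats, k=1, dims=(2, 3))
```

With an identity backbone the two rotations cancel, which is a quick sanity check that the feature map ends up back in the original image orientation.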

wangshuailpp commented 1 year ago

Thanks for the reply! I tried rotating the images before the backbone and rotating the features back, but the results for vertical and horizontal recordings are essentially unchanged, so it does not seem related to the 2D feature extraction. I also found a third case: under case 2 (where the phone mainly moves left and right), if I instead move the phone up and down while recording, the reconstructed planes are also very poor. I'd appreciate any further ideas!

nbansal90 commented 1 year ago

Hey @wangshuailpp! Were you able to zero in on the issue you saw? I think I am stumbling on the same problem. I will try the workaround you suggested and dig through the code to see if anything looks incorrect.

Meanwhile, it would be great to know your findings.

PS: BTW, I am using Android/ARCore for my setup and have taken care of the necessary details during the data-processing steps.

nbansal90 commented 1 year ago

Hey @ymingxie,

Regards, Nitin

ymingxie commented 1 year ago

@wangshuailpp @nbansal90 Hi, you can check the camera coordinate and the world coordinate from ARCore, which need to be consistent with the corresponding coordinates in ScanNet. The camera coordinate in ScanNet is the OpenCV coordinate (X: right, Y: down, Z: forward). In ScanNet's world coordinate, the XY plane is parallel to the ground and the +Z axis is the up direction (an inverse gravity-aligned coordinate).
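One common mismatch worth checking: ARCore reports camera poses with OpenGL-style camera axes (X right, Y up, Z backward), while ScanNet uses OpenCV-style axes (X right, Y down, Z forward). A minimal sketch of the conversion, assuming a 4x4 camera-to-world pose; the names here are illustrative, not from the PlanarRecon code base:

```python
# Sketch: convert an ARCore camera-to-world pose (OpenGL-style camera
# axes: X right, Y up, Z backward) to OpenCV-style camera axes
# (X right, Y down, Z forward) by flipping the camera's Y and Z axes.
import numpy as np

GL_TO_CV = np.diag([1.0, -1.0, -1.0, 1.0])  # flip camera Y and Z axes


def arcore_to_opencv_pose(T_wc_gl: np.ndarray) -> np.ndarray:
    """T_wc_gl: 4x4 camera-to-world pose with OpenGL camera axes."""
    return T_wc_gl @ GL_TO_CV
```

The axis flip is applied on the right because it changes the camera frame's own axes, leaving the camera position in the world untouched.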

To leverage the gravity prior, we rotate the camera coordinate into a gravity-aligned camera coordinate (https://github.com/neu-vi/PlanarRecon/blob/main/datasets/transforms.py#L65). The world coordinate matters because we assume the -Z direction of the world coordinate is the gravity direction. The camera coordinate matters because both the predefined anchors and the predicted planes live in the gravity-aligned camera coordinate.
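The gravity-alignment idea can be sketched as follows. This is a hedged illustration of the concept, not the exact code behind the linked `transforms.py` line: given a camera-to-world rotation with a ScanNet-style world frame (+Z up), express the gravity direction in camera coordinates and build the rotation that maps it onto the camera's "down" axis.

```python
# Sketch: build a rotation that aligns the camera frame with gravity,
# assuming a ScanNet-style world frame where gravity points along -Z
# and an OpenCV-style camera frame where "down" is +Y.
import numpy as np


def gravity_align_rotation(R_wc: np.ndarray) -> np.ndarray:
    """R_wc: 3x3 camera-to-world rotation. Returns R such that
    R @ g_cam = [0, 1, 0], i.e. gravity maps to the camera's +Y axis."""
    g_world = np.array([0.0, 0.0, -1.0])   # gravity: -Z in the world frame
    g_cam = R_wc.T @ g_world               # gravity direction in camera frame
    down = np.array([0.0, 1.0, 0.0])       # OpenCV camera "down" axis
    v = np.cross(g_cam, down)
    c = float(np.dot(g_cam, down))
    if np.isclose(c, -1.0):                # gravity exactly opposite "down"
        return np.diag([1.0, -1.0, -1.0])  # 180-degree flip about X
    K = np.array([[0.0, -v[2], v[1]],
                  [v[2], 0.0, -v[0]],
                  [-v[1], v[0], 0.0]])
    # Rodrigues-style formula for the rotation taking g_cam to "down"
    return np.eye(3) + K + K @ K / (1.0 + c)
```

If the coordinate conventions of your capture setup differ from ScanNet's, the computed gravity direction in the camera frame is wrong, and the anchors and predicted planes end up in a misaligned frame, which matches the degraded results described above.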

You'd better make the world and camera coordinate consistent with the coordinate in ScanNet. Otherwise, you may get worse results.

Feel free to comment here if you guys have more findings :)

nbansal90 commented 1 year ago

Thanks @ymingxie for the response and the pointer!

Pushkar2307 commented 1 year ago

Hi @nbansal90, were you able to get the solution working on arcore data?

nbansal90 commented 1 year ago

@Pushkar2307 Same for me: once I rotate the Android phone, capture the data, and run it through the pipeline, I get a fair result (as observed by @wangshuailpp).