keli95566 opened 1 year ago
Hi, it is true that Record3D and the underlying ARKit may fail to get the correct camera poses under some circumstances (e.g., reflective objects). Have you visualized the camera poses of these scenes, and do they make sense?
Thank you very much for getting back to me! I reduced the volume size to 2 and was able to find some parts of the recorded scene in the volume. Besides the issue with specular surfaces, it seems that the center of the volume box is placed at the first camera pose, rather than being estimated from the recorded scene. I will try to record with the first camera pose facing the object of interest and see if I get the same results. :)
For the camera pose issue, can you try to circulate around the object/scene of interest so that all your cameras are facing toward the object/scene? The script currently sets the intersection of all images' center rays to be the origin.
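For reference, "the intersection of all images' center rays" can be computed as the least-squares point closest to every camera's optical axis. Below is a minimal numpy sketch of that idea (my own function name, not the repository's actual script):

```python
import numpy as np

def rays_closest_point(origins, directions):
    """Least-squares point closest to a set of rays.

    origins: (N, 3) ray origins (camera centers)
    directions: (N, 3) ray directions (camera forward axes)

    Minimizes the sum of squared distances from the point to each ray
    by solving the normal equations built from per-ray projectors.
    """
    d = directions / np.linalg.norm(directions, axis=1, keepdims=True)
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, di in zip(origins, d):
        # Projector onto the plane orthogonal to this ray's direction
        P = np.eye(3) - np.outer(di, di)
        A += P
        b += P @ o
    return np.linalg.solve(A, b)
```

With all cameras facing the object, this point lands near the object itself, which is why circling around the scene of interest makes a good origin estimate.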
The camera pose issue is solved by your method. Thank you for the help!
@yenchenlin
I also re-calculated the camera poses with COLMAP and did a side-by-side comparison of the two rendering results. Renders with poses from COLMAP have higher quality than renders with ARKit pose estimation. I have not tested textureless scenes yet, but you are right, it seems that for some textured scenes there is a trade-off to make here.
I wonder if using a camera sensor with its own inside-out tracking would yield better pose estimation than ARKit?
Thanks for trying it out! This is aligned with my observation but actually, we can get "the best of both worlds" here by initializing COLMAP with ARKit/Record3D's camera poses. From my experience, this can prevent COLMAP from completely failing while improving the qualities of ARKit's poses.
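A rough sketch of the pose conversion such an initialization needs: COLMAP's images.txt stores world-to-camera poses as a quaternion plus translation (x_cam = R @ x_world + t), while ARKit/Record3D gives camera-to-world matrices, so each pose has to be inverted first. The function names below are my own and this is not the script mentioned above:

```python
import numpy as np

def rotmat_to_quat(R):
    """3x3 rotation matrix -> (qw, qx, qy, qz).

    Simple formula that assumes qw is well away from zero; fine for
    typical camera trajectories, not robust to exact 180-degree flips.
    """
    qw = np.sqrt(max(0.0, 1.0 + R[0, 0] + R[1, 1] + R[2, 2])) / 2.0
    qx = np.copysign(np.sqrt(max(0.0, 1.0 + R[0, 0] - R[1, 1] - R[2, 2])) / 2.0,
                     R[2, 1] - R[1, 2])
    qy = np.copysign(np.sqrt(max(0.0, 1.0 - R[0, 0] + R[1, 1] - R[2, 2])) / 2.0,
                     R[0, 2] - R[2, 0])
    qz = np.copysign(np.sqrt(max(0.0, 1.0 - R[0, 0] - R[1, 1] + R[2, 2])) / 2.0,
                     R[1, 0] - R[0, 1])
    return np.array([qw, qx, qy, qz])

def c2w_to_colmap_line(image_id, camera_id, name, c2w):
    """Turn a 4x4 camera-to-world pose into a COLMAP images.txt entry.

    COLMAP expects world-to-camera, so invert the rigid transform.
    """
    R = c2w[:3, :3].T        # inverse rotation
    t = -R @ c2w[:3, 3]      # inverse translation
    qw, qx, qy, qz = rotmat_to_quat(R)
    fields = [image_id, qw, qx, qy, qz, *t, camera_id, name]
    return " ".join(str(f) for f in fields)
```

Each entry in images.txt is followed by a second line listing 2D points, which can be left empty when feeding known poses to COLMAP's triangulation/bundle-adjustment stages.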
I have a script for that and would love to try it on this scene. Do you mind sharing the data of this capture?
That is a really good idea! Here is the link to the data: https://drive.google.com/drive/folders/1GcFt4-bmpi-zHC5VIkNODfqtnY-bS_6E?usp=sharing Would you mind making a PR if it works? I am very curious to see how long it takes to go from the coarse poses to the refined poses with COLMAP. Thank you very much!
If you're using an iOS device to capture this dataset, bear in mind that the intrinsics change with every frame due to the optical stabilization. The transforms.json allows you to override the intrinsics on a per-frame basis. Using the depth from the lidar makes a huge difference to the reconstruction.
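As a sketch of the per-frame override described above (key names follow instant-ngp's transforms.json convention; the resolution and aabb_scale values here are placeholders, and the helper function is my own):

```python
import json

def make_transforms(frames_meta, out_path="transforms.json"):
    """Write a transforms.json where each frame carries its own intrinsics.

    frames_meta: list of dicts with keys file_path, transform_matrix
    (4x4 nested list, camera-to-world), fl_x, fl_y, cx, cy.
    Per-frame intrinsics keys override any global ones.
    """
    data = {
        "w": 1920, "h": 1440,   # assumed capture resolution
        "aabb_scale": 2,
        "frames": [
            {
                "file_path": m["file_path"],
                "transform_matrix": m["transform_matrix"],
                # per-frame intrinsics (these vary with optical stabilization)
                "fl_x": m["fl_x"], "fl_y": m["fl_y"],
                "cx": m["cx"], "cy": m["cy"],
            }
            for m in frames_meta
        ],
    }
    with open(out_path, "w") as f:
        json.dump(data, f, indent=2)
    return data
```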
@keli95566 Sorry, I haven't had time to put together a PR. It's still on my to-do list; I hope this doesn't block your progress. @jc211 Do you find that using different intrinsics for each image helps?
Hi @yenchenlin I also have a couple of captures where it's clear that there are 'ghosts' in the NeRF due to drift in the VIO pose estimation. I was thinking about developing something that would combine the quality of COLMAP with the metric scale and uprightness of the record3d poses. I have actually developed something like this before, but it would not be a trivial task to port this over to these other data formats and coordinate system conventions. So if you are willing to share your script, that would be fantastic. Aside from that, let me know if you are interested in my (outdoor) datasets that exhibit the phenomenon, happy to share for testing purposes.
Hi @yenchenlin, I observed that I got better results when I removed the normalize_transforms procedure.
My scene is an inward-facing capture, and the camera positions differ a lot with and without normalization. The camera poses are shown with the unit box as follows:

[images: camera poses without normalization | with normalization]
After training ~20k steps with extrinsics optimization, the without-norm result was better than the with-norm result, especially regarding the 'ghost' issue.

@yenchenlin What was the design intent behind this normalize_transforms part?
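For context, a typical pose-normalization step recenters the camera centers and rescales them so the capture fits a unit box. The sketch below is a guess at what normalize_transforms might be doing, not the repository's actual code:

```python
import numpy as np

def normalize_poses(c2w_list, target_radius=1.0):
    """Recenter and rescale a list of 4x4 camera-to-world poses.

    Translates the camera centers to their centroid and scales them so the
    farthest camera sits at target_radius from the origin. For an inward-
    facing capture, this moves the cameras, not the object, toward the
    box center, which may explain the behavior observed above.
    """
    poses = np.stack(c2w_list)          # (N, 4, 4)
    centers = poses[:, :3, 3]
    centroid = centers.mean(axis=0)
    scale = target_radius / np.max(np.linalg.norm(centers - centroid, axis=1))
    out = poses.copy()
    out[:, :3, 3] = (centers - centroid) * scale
    return out
```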
@jc211 I tried using different intrinsics for each image, but I got an error when loading the data. Did you make any changes to the loading code?
Hi @yenchenlin, would you mind sharing the script for initializing COLMAP with ARKit/Record3D's camera poses here? Thanks!
Hi all, please run the following steps:

1. Unzip the dataset: unzip ba_example.zip
2. Run the notebook bundle_adjustment_colmap.ipynb once your folder looks like the following:

├── bundle_adjustment_colmap.ipynb
└── ba_example

Then you should be able to run instant-ngp by treating ba_example as a dataset!
Hi there! I noticed that we can now read the camera poses from Record3D and skip running COLMAP. However, after following the new data preparation tip, I tried a few Record3D scenes, but they did not converge to a sensible scene like the ones run with COLMAP. (I also tried rotating the images.)
Any further tips on how to run Record3D scenes correctly?