From the visualization, it looks like the camera intrinsics / extrinsics are incorrectly configured. I haven't used ARKitScene, so I'm unsure what the camera conventions for that dataset look like. You also seem to be using the value of `png_depth_scale` from the Replica sequences. For most other datasets, that parameter is `1000` (except for TUM RGB-D, for which I believe it is `5000`).
Here is the dataset format of ARKITScene:

```
/[scene_id]
  /[scene_id]_frames
    /lowres_depth            (depth images)
    /lowres_wide             (color images)
    /lowres_wide_intrinsics  (camera intrinsics)
    /lowres_wide.traj        (camera extrinsics)
```

`lowres_wide_intrinsics` (camera intrinsics) is a single-line text file, space-delimited, with the following fields: `width height focal_length_x focal_length_y principal_point_x principal_point_y`. Here are the specific intrinsic values for this dataset:

```
256 192 212.027 212.027 127.933 95.9333
```
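For reference, a minimal sketch of parsing that file into a 3x3 camera matrix (the helper name `load_intrinsics` is just for illustration):

```python
import numpy as np

# Parse a lowres_wide_intrinsics file (one line: width height fx fy cx cy)
# into image dimensions and a 3x3 pinhole camera matrix.
def load_intrinsics(path):
    w, h, fx, fy, cx, cy = np.loadtxt(path)
    K = np.array([[fx, 0.0, cx],
                  [0.0, fy, cy],
                  [0.0, 0.0, 1.0]])
    return int(w), int(h), K
```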
`lowres_wide.traj` (camera extrinsics) is a space-delimited file where each line represents a camera position at a particular timestamp.
For camera intrinsics, I think ARKITScene 3DOD does not provide a specific value for `depth_scale`; however, I load the other parameters as stated in the file.
For extrinsics, I use the following code to print all the camera poses when running `concept-graphs/conceptgraph/slam/rerun_realtime_mapping.py`, given `stride=10` in `rerun_realtime_mapping.yaml`:

```python
unt_pose = dataset.poses[frame_idx]
# stride=10, so frame_idx*10+1 recovers the line number in lowres_wide.traj
print(f"==============dataset poses===============\n{unt_pose}, \n{frame_idx*10+1}")
unt_pose = unt_pose.cpu().numpy()
```
Here is one example: the camera pose printed while processing line 261 of scene 40753679 is given below:

```
==============dataset poses===============
tensor([[-0.7846,  0.6184,  0.0439,  0.6561],
        [-0.2921, -0.3062, -0.9060,  0.9303],
        [-0.5469, -0.7237,  0.4209,  1.4225],
        [ 0.0000,  0.0000,  0.0000,  1.0000]]),
261
```
The original line from `lowres_wide.traj` in the dataset is given below:

```
Line 261: 6816.11971479 0.42390419235973026 1.3735961783384674 -2.116927535648726 0.65612 0.930267 1.42246
```
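One way to cross-check is to convert the raw line yourself and compare. This is a sketch under the assumption (based on my reading of the ARKitScenes docs) that each `.traj` line is a timestamp, an axis-angle rotation, and a translation:

```python
import numpy as np
from scipy.spatial.transform import Rotation

# Assumed .traj layout: timestamp rx ry rz tx ty tz, with (rx, ry, rz)
# an axis-angle rotation. The translation here does match the last
# column of the tensor printed above.
line = "6816.11971479 0.42390419235973026 1.3735961783384674 -2.116927535648726 0.65612 0.930267 1.42246"
ts, rx, ry, rz, tx, ty, tz = map(float, line.split())

pose = np.eye(4)
pose[:3, :3] = Rotation.from_rotvec([rx, ry, rz]).as_matrix()
pose[:3, 3] = [tx, ty, tz]
print(pose)
# If the rotation block disagrees with the printed tensor, the loader is
# interpreting the line differently; note also that, depending on convention,
# the stored transform may be world-to-camera and need inverting.
```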
I did not see any error. I just wonder if that could mean I am getting the right extrinsics.
In case other ways of verifying would help, I also have a `.json` object-annotation file and a `.ply` mesh (maybe helpful for verification?). I wonder if there are any other ways to verify. Thanks!
Quick comment on the visualization posted above: it seems like you are trying to use the ARKITScene dataset, but the rerun visualization appears to be from the Replica `office0` sequence. Is that intentional? If not, there could be an error someplace in your config.
The preferred way to verify that the intrinsics and extrinsics are indeed correct is to visualize the 3D reconstruction produced by `gradslam` (running the script here on the first 100-200 frames of your dataset, with a frameskip of ~25, should suffice; you wouldn't need to run on the full sequence). It is hard for me to tell just by looking at the dataset poses whether they're accurate, since I haven't used that dataset before.
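For what it's worth, a minimal sketch of that check (assuming `dataset[i]` returns per-frame `(color, depth, intrinsics, pose)` tensors with gradslam-style shapes; adjust to your loader):

```python
import torch
from gradslam import RGBDImages
from gradslam.slam import PointFusion

# Take a short, strided subset of frames for a quick sanity check.
frameskip = 25
frames = [dataset[i] for i in range(0, min(len(dataset), 200), frameskip)]

colors = torch.stack([f[0] for f in frames]).unsqueeze(0)  # (1, L, H, W, 3)
depths = torch.stack([f[1] for f in frames]).unsqueeze(0)  # (1, L, H, W, 1), assuming (H, W, 1) depth
intrinsics = frames[0][2].view(1, 1, 4, 4)                 # (1, 1, 4, 4)
poses = torch.stack([f[3] for f in frames]).unsqueeze(0)   # (1, L, 4, 4)

rgbd = RGBDImages(colors, depths, intrinsics, poses)
slam = PointFusion(odom="gt", device="cuda")  # "gt" trusts the dataset poses; use "cpu" if needed
pointclouds, _ = slam(rgbd)
pointclouds.plotly(0).show()                  # inspect the fused map for misalignment
```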
Thanks for the suggestions! About `office0`: sorry for the confusion, I was just trying to show what happened in my ARKITScene visualization. About `gradslam`: I still have some questions. When I run it, I get the following error. I took a look at the original `Pointclouds` code in the gradslam repo and see that there is no attribute `save_to_h5`. I wonder how I can resolve this.
```
line 118, in main
    pointclouds.save_to_h5(dir_to_save_map)
AttributeError: 'Pointclouds' object has no attribute 'save_to_h5'
```
The issue with `gradslam` is perhaps due to using the `main` branch? You will need to use the `conceptfusion` branch (https://github.com/gradslam/gradslam/tree/conceptfusion) to access features essential for this repo.
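(In case it helps: assuming a pip-managed environment, one way to switch is `pip uninstall gradslam` followed by `pip install git+https://github.com/gradslam/gradslam.git@conceptfusion`.)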
Thanks a lot! It works. But the result, shown below, is still pretty weird. A few potential problems come to mind:
IMO, printing out the camera pose isn't an issue (the rounding off happens when you print most arrays/tensors). You could try loading in double precision and then casting to float, but I don't think that'd be a major factor.
As for ARKitScenes, I've unfortunately got no experience working with it. I'd double check the intrinsics, camera poses (and axis conventions) carefully.
I looked up their documentation very quickly, and based on this description, the depth is in millimeters, so the depth scale parameter must be set to `1000.0`.
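Concretely, the conversion that parameter controls looks like this (the filename below is hypothetical):

```python
import numpy as np
import imageio.v2 as imageio

# ARKitScenes lowres_depth PNGs store depth in millimeters, so
# png_depth_scale = 1000.0 converts the raw uint16 values to meters.
depth_mm = imageio.imread("lowres_depth/6816.11971479.png")  # hypothetical filename
depth_m = depth_mm.astype(np.float32) / 1000.0
```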
Thanks for the help during the week. I have resolved the problem by changing the way I load the camera poses; it really helped a lot. Also, thank you for providing such amazing code!
I am now trying to run concept graphs on the ARKITScene dataset and have encountered a few problems. I have finished watching the tutorial and successfully generated the mapping on rerun.io using the Replica dataset. Here are the changes I made to run concept graphs on the ARKITScene dataset:

1. Modified `dataset/dataset_common.py` so that when `config_dict["dataset_name"] == 'arkitscene'`, initialization is done with an ARKITScene dataset class.
2. Created a `dataset/dataconfigs/arkitscene/arkitscene.yaml` file with the following parameters. I have tested all of the above parameters by using them to generate an Open3D simulation with only the rgb images and depth images.
3. Modified `hydra_configs/arkitscene.yaml` to update the paths.

Here is the problem: in the clip from rerun.io below, there are overlapping items. I am really not sure what causes those. I would definitely appreciate it if someone could share opinions and suggestions on the problem. Thanks a lot.
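One way to localize that kind of misalignment is to back-project a couple of depth frames into world coordinates with their poses and see whether the surfaces line up. A sketch, assuming `frames` is a list of `(depth_in_meters, 4x4_pose)` pairs and `K` is the 3x3 intrinsics matrix from above:

```python
import numpy as np
import open3d as o3d

def backproject(depth_m, K, pose):
    # Lift a depth map (in meters) to world coordinates via the pinhole model.
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_m.ravel()
    x = (u.ravel() - K[0, 2]) * z / K[0, 0]
    y = (v.ravel() - K[1, 2]) * z / K[1, 1]
    pts_cam = np.stack([x, y, z, np.ones_like(z)])
    return (pose @ pts_cam)[:3].T

# If clouds from different frames don't overlap where the scene does,
# the extrinsics (or their axis convention) are the likely culprit.
pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(
    np.vstack([backproject(d, K, p) for d, p in frames]))
o3d.visualization.draw_geometries([pcd])
```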