ghost opened this issue 2 years ago
Are you using Python? Here's a potential quick fix:

```Python
frustum_block_coords = vbg.compute_unique_block_coordinates(depth, intrinsic, extrinsic.to(o3d.core.Float64))
```

Similarly, if `extrinsic` is a numpy type, you can do `extrinsic.astype(np.float64)`.

This can be improved within the C++ code to handle both Float32 and Float64 @theNded.
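For example, a quick sketch of both casts (assuming `extrinsic` already holds the 4x4 matrix, either as an Open3D tensor or as a numpy array):

```Python
import numpy as np
import open3d.core as o3c

# If extrinsic is an Open3D tensor, cast it directly:
extrinsic = extrinsic.to(o3c.Dtype.Float64)

# If extrinsic is a numpy array, cast before wrapping it in a tensor:
extrinsic = o3c.Tensor(extrinsic.astype(np.float64))
```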
Thanks for the reply, yes I'm using Python.
As mentioned, I've tried casting between the two dtypes in several ways (including the ones you mentioned), but casting to Float64 breaks it, leaving me with a single point in the middle of the scene. Float32 works fine and can be used when creating an individual point cloud, but it cannot be used in the pipeline because the extrinsic is required to be Float64.
I've also quickly tried building Open3D from source and deleting the assertions for Float64 (or adding support for Float32) in `Utility.h`, but it seems the dtype is used further down the pipeline, so implementing this would require a bit more work.
EDIT: When reading the poses from the file, they are Float64 by default. I need to cast them to Float32 to make them work with the PointCloud (which I can't do with the pipeline). Creating tensors of the poses (only the translation values/coordinates) with both dtypes prints identical values, but only Float64 is accepted by the pipeline:
(Same result with o3d.core.float64/32.)
EDIT 2: Since I can pass Float32 into an individual point cloud, here's what it looks like when reading the poses as-is (Float64): just a single point;
and after casting to Float32: the correct trajectory.
Thus, I need to cast the Float64 poses to Float32 whenever I want to visualise them in a point cloud. I wanted to use the same trick with the reconstruction pipeline, but as mentioned, it only accepts Float64.
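To make the clash concrete, here is a minimal sketch of the two constraints (assuming `poses` is the numpy array of 4x4 matrices described above):

```Python
import numpy as np
import open3d as o3d
import open3d.core as o3c

# Visualisation: only Float32 positions render correctly for me.
pcd = o3d.t.geometry.PointCloud()
pcd.point["positions"] = o3c.Tensor(np.array(poses)[:20, :3, 3].astype(np.float32))

# Pipeline: the extrinsic must be Float64, so the cast above cannot be reused.
extrinsic = o3c.Tensor(np.linalg.inv(poses[0]), dtype=o3c.Dtype.Float64)
```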
The setup is very confusing, and there are a few things I don't understand. My understanding is that there are two problems: one with the pose conversion, and one with the tensor reconstruction. I assume something goes wrong with the pose visualization that leads you to believe the conversions are wrong. Please describe your setup step by step, ideally with commented code snippets.
Sorry for the confusion, let me try again:
- I'm reading the poses from a text file; by default, the dtype is Float64 (just reading a file into a numpy array).
- I've noticed an issue with the reconstruction pipeline, where the camera (given by the pose) wasn't "moving", only rotating; this produced overlapping point clouds of the images inside the voxel block grid. Here I followed the documentation; a simplified snippet just in case:
```Python
img_left = np.array(Image.open(imgs_left[i]))
img_right = np.array(Image.open(imgs_right[i]))

disp, _ = Elas().process(img_left, img_right)  # Compute disparity map
depth = ...  # Some processing of the disparity map, converting it into a depth map
depth = o3d.t.geometry.Image(depth)

intrinsic = load_intrinsic()  # Using the Open3D json config file
intrinsic = o3c.Tensor(intrinsic, dtype=o3c.Dtype.Float64)
extrinsic = o3c.Tensor(np.linalg.inv(poses[i]), dtype=o3c.Dtype.Float64)

frustum_block_coords = vbg.compute_unique_block_coordinates(
    depth, intrinsic, extrinsic, config['depth_scale'], config['depth_max'])

color = o3d.t.io.read_image(imgs_left_color[i]).to(device)
vbg.integrate(frustum_block_coords, depth, color, intrinsic, extrinsic,
              config['depth_scale'], config['depth_max'])
```
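To see the overlap I mentioned, I extract a point cloud from the voxel block grid after the integration loop (a minimal sketch using `VoxelBlockGrid.extract_point_cloud`):

```Python
# After integrating all frames, extract and view the fused geometry:
pcd = vbg.extract_point_cloud()
o3d.visualization.draw([pcd])
```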
- To understand what's going on, I wanted to visualise only the camera trajectory/poses to see whether the camera really is "not moving". I've extracted only the **translation values** from the poses and created a _separate_ point cloud to visualise the trajectory (images in the previous post):
```Python
pcd = o3d.t.geometry.PointCloud(device)
pcd.point["positions"] = o3c.Tensor(np.array(poses)[:20, :3, 3], dtype=o3c.Dtype.Float64, device=device)
# Takes the first 20 poses, first three rows, fourth column (index 3), i.e. the translation values only
# Here I stuck with Float64
```
With the poses in Float64 (the default dtype), they "collapse" into a single point, as you can see in the image above. Thus, when using these poses inside the reconstruction pipeline, the camera essentially only rotates in one place and the images' point clouds overlap.
If I cast the poses to Float32 at any point using any approach (numpy or Open3D directly) and visualise the trajectory again with the snippet above, it's correct, as seen in the other image (each point is one pose).
However, I cannot pass these cast Float32 poses into the pipeline, as it accepts only Float64.
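As a sanity check that the translation values themselves are fine in both dtypes (a quick sketch on the same `poses` array):

```Python
import numpy as np

t64 = poses[:20, :3, 3]          # translations, float64 by default
t32 = t64.astype(np.float32)     # the cast that "fixes" the visualisation

# The spread per axis is clearly non-zero in both dtypes,
# so the points should not collapse in either case:
print("float64 spread per axis:", np.ptp(t64, axis=0))
print("float32 spread per axis:", np.ptp(t32, axis=0))
print("max cast error:", np.abs(t64 - t32.astype(np.float64)).max())
```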
Just to filter out possible suggestions: I've already tried the casting approaches mentioned earlier in the thread.
Hope this helps @theNded
Thanks for the clarification. Let's figure out the pose issue first, as I cannot reproduce the issue with my own pose files also loaded into numpy. Could you please provide the pose file and the snippet to load the poses?
No problem! The pose file is in the 3x4 matrix format; it's for the KITTI odometry dataset (sequence 7), and the file was produced by ORB-SLAM2 (its built-in function for KITTI):
Here is the function for reading the poses:
```Python
import numpy as np

def load_poses():
    poses = []
    with open("../lib/orb-slam2/CameraTrajectory.txt", "r") as f:
        for line in f.readlines():
            # Each line is a flattened 3x4 pose matrix
            pose = np.fromstring(line, sep=' ').reshape(-1, 4)
            # Append the homogeneous row to get a 4x4 matrix
            pose = np.vstack((pose, [0.0, 0.0, 0.0, 1.0]))
            poses.append(pose)
    return np.array(poses)  # Changed to numpy for easier indexing; the issue was the same with just a list
```
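A quick check of what the function returns (just a sketch):

```Python
poses = load_poses()
print(poses.shape)  # (N, 4, 4) -- one homogeneous matrix per line of the file
print(poses.dtype)  # float64 -- numpy's default, as mentioned above
```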
Here is again the code for creating the point cloud I'm using:
```Python
poses = load_poses()
device = o3c.Device("CPU:0")

pcd = o3d.t.geometry.PointCloud(device)
pcd.point["positions"] = o3c.Tensor(np.array(poses)[:20, :3, 3], dtype=o3c.Dtype.Float64, device=device)  # First 20 poses
o3d.t.io.write_point_cloud("./point_clouds/pointcloud.pcd", pcd)

# Changing the dtype in either the numpy array or in the tensor above produces the same result
# The trajectory is shown correctly with Float32
```
@theNded Sorry for the ping/bump. I'm sure you'd let me know in this thread if there were any updates regarding this issue; I just want to make sure, as this is a somewhat critical part of my project with a strict deadline, and I would otherwise have to look for a different solution. That said, I'll try to look into the issue myself and potentially submit a pull request, but right now I have to focus on other aspects of the project. Do you think I can expect a fix or further information anytime soon?
Sorry for the delay, I have been very busy (I also have strict deadlines), and this issue somehow fell out of my inbox...
I tried to load your trajectory, and all of the snippets below produce a reasonable visualization:

```Python
pcd = o3d.t.geometry.PointCloud(o3c.Tensor(poses[:20, :3, 3]))
```

or

```Python
pcd = o3d.t.geometry.PointCloud(o3c.Tensor(poses[:20, :3, 3], dtype=o3c.Dtype.Float64))
```

or

```Python
pcd = o3d.t.geometry.PointCloud()
pcd.point['positions'] = o3c.Tensor(poses[:20, :3, 3], dtype=o3c.Dtype.Float64)
```

or

```Python
pcd = o3d.t.geometry.PointCloud()
pcd.point['positions'] = o3c.Tensor(np.array(poses)[:20, :3, 3], dtype=o3c.Dtype.Float64)
```

The full trajectory also makes sense:

Tested on 0.14.1 and 0.15.2.
Thanks for getting back to me and testing the poses.
After trying all the alternatives you mentioned without success, I noticed I was using the legacy system for reading the point cloud before visualising it, i.e. `o3d.io.read_point_cloud(...)` instead of `o3d.t.io.read_point_cloud(...)`.
With the tensor one, the visualisation looks correct when using Float64. I assume conflicts between the legacy and tensor systems are expected to occur, but it might still be worth implementing a check/warning message for such cases?
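For reference, a minimal round-trip sketch of the mix-up (the legacy reader is what broke the visualisation for me; `pcd` is the point cloud from the snippet above):

```Python
import open3d as o3d

# Written with the tensor API (Float64 positions), as in the snippet above:
o3d.t.io.write_point_cloud("./point_clouds/pointcloud.pcd", pcd)

# Legacy reader -- what I was using; the Float64 trajectory collapsed:
pcd_legacy = o3d.io.read_point_cloud("./point_clouds/pointcloud.pcd")

# Tensor reader -- with this one, the Float64 trajectory is correct:
pcd_tensor = o3d.t.io.read_point_cloud("./point_clouds/pointcloud.pcd")
```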
Thanks for the help!
This should not happen in theory. @reyanshsolis, are there compatibility issues when writing a double-precision pcd with t.io and reading it with io?
Checklist (tested against the master branch).

My Question
Hi, I'm using the tensor reconstruction system and I've noticed the transformations from the poses are not passed in correctly. After some digging, I found that the reconstruction system accepts only Float64 extrinsic values:
My transformations are in Float64, but with this dtype, all of the values "collapse" into a single point and the camera just rotates in one place (since only the rotation values in the extrinsic are correct).
If I create a PointCloud with the translation values passed in as positions cast to Float32, I can see the pose graph just fine; with Float64, I only see one point. As mentioned, I cannot pass Float32 into the reconstruction pipeline, namely when calling:
```Python
frustum_block_coords = vbg.compute_unique_block_coordinates(depth, intrinsic, extrinsic)
```

(even with the dtype Float64 specified). Any solution/workaround for this?