roboticslab-uc3m / openrave-yarp-plugins

OpenRAVE plugins to interface OpenRAVE with YARP.
https://robots.uc3m.es/openrave-yarp-plugins/
GNU Lesser General Public License v2.1

Mesh render from real depth sensor data #70

Closed: jgvictores closed this issue 3 years ago

jgvictores commented 6 years ago

As described at #27:

programs/depthToOR gets a point cloud from a real sensor, creates a mesh, and puts it in an OpenRAVE environment.

We'll probably want a plugin!

PeterBowman commented 6 years ago

Linking to that single commit, just in case: https://github.com/roboticslab-uc3m/openrave-yarp-plugins/commit/a7b3e06d6225c1bc6cb190fbc732cdfe7c2ed3e4.

jgvictores commented 5 years ago

Raising priority

PeterBowman commented 4 years ago

I have a depth sensor at my place, so this is a nice opportunity for me to get my first hands-on experience with PCL! Shamelessly taking this over.

PeterBowman commented 4 years ago

WIP at https://github.com/roboticslab-uc3m/openrave-yarp-plugins/commit/1ef72ecb5e5aea458bb680d5f0f2a5a9ecebc66e. This is quite slow: after an aggressive downsampling step (voxel leaf size of 0.05f), a single iteration takes about 5 seconds. Also, I can recognize nothing from the resulting mesh...

It looks like some performance gain could be achieved with GPU support enabled, although that requires compiling PCL from sources: https://pointclouds.org/documentation/tutorials/gpu_install.html.

Edit: sadly, I don't think it will actually help. Only a limited subset of PCL algorithms has been ported to (CUDA) GPU, and out of those, only normal estimation is of our concern (ref).

Edit 2: well, we can provide our own customizations to PCL+GPU, I guess. Later, an upstream PR would follow.
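
For reference, a minimal sketch of the downsample + normal estimation + Poisson pipeline described above, assuming standard PCL types; the leaf size matches the comment, while the helper name and the neighborhood size are illustrative, not the WIP code:

#include <pcl/PolygonMesh.h>
#include <pcl/common/io.h>
#include <pcl/features/normal_3d_omp.h>
#include <pcl/filters/voxel_grid.h>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/surface/poisson.h>

pcl::PolygonMesh reconstructPoisson(const pcl::PointCloud<pcl::PointXYZ>::ConstPtr & input)
{
    // 1. Aggressive downsampling (voxel leaf size of 0.05).
    pcl::PointCloud<pcl::PointXYZ>::Ptr downsampled(new pcl::PointCloud<pcl::PointXYZ>());
    pcl::VoxelGrid<pcl::PointXYZ> grid;
    grid.setInputCloud(input);
    grid.setLeafSize(0.05f, 0.05f, 0.05f);
    grid.filter(*downsampled);

    // 2. Normal estimation (Poisson requires oriented normals).
    pcl::PointCloud<pcl::Normal>::Ptr normals(new pcl::PointCloud<pcl::Normal>());
    pcl::NormalEstimationOMP<pcl::PointXYZ, pcl::Normal> estimator;
    estimator.setInputCloud(downsampled);
    estimator.setKSearch(40); // illustrative neighborhood size
    estimator.compute(*normals);

    pcl::PointCloud<pcl::PointNormal>::Ptr cloudWithNormals(new pcl::PointCloud<pcl::PointNormal>());
    pcl::concatenateFields(*downsampled, *normals, *cloudWithNormals);

    // 3. Poisson surface reconstruction.
    pcl::Poisson<pcl::PointNormal> poisson;
    poisson.setInputCloud(cloudWithNormals);

    pcl::PolygonMesh mesh;
    poisson.reconstruct(mesh);
    return mesh;
}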

PeterBowman commented 4 years ago

For future reference, alternative algorithms to Poisson:

And a very interesting article expanding on those two methods plus Poisson: Keiser, A. et al, Mesh Reconstruction Using The Point Cloud Library.

PeterBowman commented 4 years ago

Since building a surface mesh is a time-consuming process, for ease of inspection (and better fps) I tried to restore the following lines which attempt to retrieve and render an array of spheres:

https://github.com/roboticslab-uc3m/openrave-yarp-plugins/blob/3cbdd51e8372f0f337722fcafdf535e6bc878bd8/programs/depthToOR/DepthToOR.cpp#L67-L87

Sadly and surprisingly, this is vastly less efficient than creating a surface mesh. A voxel grid resolution of 0.05 for the downsampling filter produces a cloud of approx. 7000-7500 points. The sphere mesh didn't even work at a resolution of 0.1 (~1000 points); the resolution had to be lowered to 0.5 (~80 points) to actually produce a result.

Edit: KinBody::InitFromSpheres.
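
For context, a minimal sketch of what this sphere-based path could look like with KinBody::InitFromSpheres; the helper name, body name and sphere radius are illustrative, not the code behind the permalink above:

#include <vector>
#include <openrave/openrave.h>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>

OpenRAVE::KinBodyPtr makeSphereBody(OpenRAVE::EnvironmentBasePtr env,
                                    const pcl::PointCloud<pcl::PointXYZ> & cloud)
{
    std::vector<OpenRAVE::Vector> spheres;
    spheres.reserve(cloud.size());

    for (const auto & p : cloud)
    {
        // x, y, z are the sphere center; the fourth component is the radius.
        spheres.emplace_back(p.x, p.y, p.z, 0.05);
    }

    auto body = OpenRAVE::RaveCreateKinBody(env, "");
    body->SetName("pointCloudSpheres");
    body->InitFromSpheres(spheres, true); // true: render the spheres
    env->Add(body, true);
    return body;
}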

PeterBowman commented 4 years ago

This is quite slow: after an aggressive downsampling step (voxel leaf size of 0.05f), a single iteration takes about 5 seconds.

...when the depth frame resolution is set to 1280x720. This duration is much lower at 640x480: 1.5-2 seconds. In this scenario, the downsampling and normal estimation steps take 20-25 milliseconds each, and the OpenRAVE renderer needs 130-180 milliseconds to visually translate the data into a 3D surface mesh. The remaining 1.3-1.6 seconds are devoted to actually producing a surface from the points and normals in PCL. By the way, this surface comprises ~16000 points (vertices).

PeterBowman commented 4 years ago

...and the OpenRAVE renderer needs 130-180 milliseconds to visually translate the data into a 3D surface mesh.

@jgvictores suggested disabling the OR viewer (i.e. going headless). I see no difference, though.

BTW OpenRAVE::KinBody::InitFromMesh is cost-free. The OR mesh is not parsed until you actually add it to the environment via OpenRAVE::EnvironmentBase::Add.

PeterBowman commented 4 years ago

Hi, this is me sitting at my desk, you can probably recognize my healthy silhouette (mostly the head, torso and right arm):

Screenshot from 2020-11-04 20-35-41

This is the outcome of the greedy projection triangulation algorithm (0.01 voxel grid):

Screenshot from 2020-11-04 20-36-40

Now, this is me waving at the camera (left hand in front of my head) with Poisson:

Screenshot from 2020-11-04 20-35-08

PeterBowman commented 4 years ago

Design considerations:

I'm not sure if surfaceReconstruction should keep on generating meshes as new frames arrive. Alternatively, both server (surfaceReconstruction) and client (openraveYarpWorldClient) would need to connect via RPC port and follow a strict communication protocol (i.e. vocab-based commands). In short: either keep the generator app running and consuming resources, or establish a tight coupling between the surface generator app and an incidentally unrelated client app (we just initialize the mesh as an OpenRAVE structure and pass it on to the world server). I'd like the client to behave like "hey, let's listen to this port and ignore the internals of the remote app, I just need to know the correct data type(s)".

Edit: alternatively, make surfaceReconstruction a one-shot thing, i.e. 1. connect to remote camera, 2. generate surface, 3. save to file, 4. close. Then, perhaps we can just call openraveYarpWorldClient --file with that saved mesh.

PeterBowman commented 4 years ago

Concave hull with alpha = 0.01 (~70000 mesh points); it takes 3-4 seconds to reconstruct, plus 1.5 s for OR to paint it:

Screenshot from 2020-11-07 19-54-42
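
For reference, a minimal sketch of this concave hull reconstruction with pcl::ConcaveHull, using the alpha value from this comment; the helper name is illustrative:

#include <pcl/PolygonMesh.h>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/surface/concave_hull.h>

pcl::PolygonMesh reconstructConcaveHull(const pcl::PointCloud<pcl::PointXYZ>::ConstPtr & cloud)
{
    pcl::ConcaveHull<pcl::PointXYZ> hull;
    hull.setInputCloud(cloud);
    hull.setAlpha(0.01); // smaller alpha keeps more detail, at the cost of more faces
    hull.setDimension(3);

    pcl::PolygonMesh mesh;
    hull.reconstruct(mesh);
    return mesh;
}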

PeterBowman commented 4 years ago

Good news: OpenRAVE can open PLY files (only ASCII format, it doesn't seem to like binary PLYs).

Screenshot from 2020-11-17 19-53-37
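
A minimal sketch of exporting the reconstructed mesh in the ASCII PLY flavor that OpenRAVE accepts, assuming it is held in a pcl::PolygonMesh; the file and function names are illustrative:

#include <pcl/PolygonMesh.h>
#include <pcl/io/ply_io.h>

void exportForOpenRAVE(const pcl::PolygonMesh & mesh)
{
    // savePLYFile writes ASCII PLY; savePLYFileBinary would produce the binary
    // variant that OpenRAVE apparently rejects.
    pcl::io::savePLYFile("surface.ply", mesh);
}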

PeterBowman commented 4 years ago

Edit: sadly, I don't think it will actually help. Only a limited subset of PCL algorithms has been ported to (CUDA) GPU, and out of those, only normal estimation is of our concern (ref).

Well, there actually exists a GPU-based implementation of marching cubes surface reconstruction used in kinfu: pcl::gpu::MarchingCubes (also for large scale kinfu). I might consider this, but currently I see no actual need for a high-rate mesh-producing algorithm.

PeterBowman commented 4 years ago

Point clouds generated from depth maps are said to be organized due to their image-like data layout, i.e. each point of the cloud corresponds to a pixel of the depth frame and can be referred to by its row+column coordinates. In other words, an organized point cloud has a certain number of rows and columns, whereas an unorganized one is by convention stored as a single row.

Organized clouds can be targeted by specialized algorithms that exploit this layout for improved performance. In that vein, pcl::OrganizedFastMesh takes advantage of it for efficient surface generation. On my machine, this method yields a mesh of ~180k faces from a 1280x720 depth frame (921600 pixels/points) in only ~260 ms. With 640x480 frames (307200 pixels/points), it can produce ~600k faces in 80 ms at full resolution, or 36k faces in 45 ms.
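
A minimal sketch of this approach, assuming an organized input cloud (i.e. one that keeps the width x height layout of the depth frame); the triangulation type, decimation value and helper name are illustrative:

#include <pcl/PolygonMesh.h>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/surface/organized_fast_mesh.h>

pcl::PolygonMesh meshFromOrganizedCloud(const pcl::PointCloud<pcl::PointXYZ>::ConstPtr & organized)
{
    pcl::OrganizedFastMesh<pcl::PointXYZ> ofm;
    ofm.setInputCloud(organized); // must be an organized cloud
    ofm.setTriangulationType(pcl::OrganizedFastMesh<pcl::PointXYZ>::TRIANGLE_ADAPTIVE_CUT);
    ofm.setTrianglePixelSize(15); // decimation: roughly one triangle every 15 pixels

    pcl::PolygonMesh mesh;
    ofm.reconstruct(mesh);
    return mesh;
}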

PeterBowman commented 3 years ago

Edit: alternatively, make surfaceReconstruction a one-shot thing, i.e. 1. connect to remote camera, 2. generate surface, 3. save to file, 4. close. Then, perhaps we can just call openraveYarpWorldClient --file with that saved mesh.

After much thought and some intense work on the vision repo, regarding a new KinFu-based application (https://github.com/roboticslab-uc3m/vision/issues/101) and a new library of point cloud-related utilities (https://github.com/roboticslab-uc3m/vision/issues/102), this edit is quite close to how the final product will be shaped. However, there will be no additional service in the middle, just a plain function call that accepts a cloud and spits out a mesh surface.

In short, I envision two ways to deal with this:

1. The existing openraveYarpWorldClient app would open a .ply file and stream its contents once to OpenraveYarpWorld, thus adding the mesh to the environment.
2. A new variant of openraveYarpWorldClient would connect to a remote depth sensor. Then, most work is meant to happen locally: converting the depth frame to a point cloud and reconstructing a surface mesh upon that. The result, once again, is sent to OpenraveYarpWorld.

In both cases OpenraveYarpWorld accepts the yarp::sig::PointCloud (cloud of vertices) + yarp::sig::VectorOf<int> (vector of indices) duo. It is responsible for the conversion to OpenRAVE's TriMesh.
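
For illustration, a minimal sketch of that conversion on the server side, assuming three indices per triangle; the function name is mine, not the actual OpenraveYarpWorld code:

#include <openrave/openrave.h>
#include <yarp/sig/PointCloud.h>
#include <yarp/sig/Vector.h>

OpenRAVE::TriMesh toTriMesh(const yarp::sig::PointCloudXYZ & vertices,
                            const yarp::sig::VectorOf<int> & indices)
{
    OpenRAVE::TriMesh mesh;
    mesh.vertices.reserve(vertices.size());

    for (size_t i = 0; i < vertices.size(); i++)
    {
        const auto & p = vertices(i);
        mesh.vertices.push_back(OpenRAVE::Vector(p.x, p.y, p.z));
    }

    // Each consecutive triplet of indices defines one triangle of the mesh.
    mesh.indices.reserve(indices.size());

    for (size_t i = 0; i < indices.size(); i++)
    {
        mesh.indices.push_back(indices[i]);
    }

    return mesh;
}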

PeterBowman commented 3 years ago

The existing openraveYarpWorldClient app would open a .ply file and stream its contents once to OpenraveYarpWorld, thus adding the mesh to the environment.

Done at https://github.com/roboticslab-uc3m/openrave-yarp-plugins/commit/9805e4c80290c8aa4771c77a6364863322388093. Note that a .ply file opened with ReadKinBodyURI is rendered in black&white pretty much like in https://github.com/roboticslab-uc3m/openrave-yarp-plugins/issues/70#issuecomment-729132138. On the other hand, meshes loaded via ReadTrimeshURI (and read as KinBodies) have a similar look to https://github.com/roboticslab-uc3m/openrave-yarp-plugins/issues/70#issuecomment-723481640. I wonder if the latter signifies that collision checking is enabled.
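
For reference, a minimal sketch contrasting the two loading paths compared above; the file and body names are illustrative:

#include <openrave/openrave.h>

void loadPlyBothWays(OpenRAVE::EnvironmentBasePtr env)
{
    // Path 1: let OpenRAVE build a KinBody directly from the .ply file.
    auto bodyFromFile = env->ReadKinBodyURI("surface.ply");
    bodyFromFile->SetName("meshFromFile");
    env->Add(bodyFromFile, true);

    // Path 2: read the file as a raw TriMesh, then initialize a KinBody from it.
    auto trimesh = env->ReadTrimeshURI(nullptr, "surface.ply");
    auto bodyFromTrimesh = OpenRAVE::RaveCreateKinBody(env, "");
    bodyFromTrimesh->InitFromTrimesh(*trimesh, true); // true: make it visible
    bodyFromTrimesh->SetName("meshFromTrimesh");
    env->Add(bodyFromTrimesh, true);
}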

PeterBowman commented 3 years ago

A new variant of openraveYarpWorldClient would connect to a remote depth sensor. Then, most work is meant to happen locally: converting the depth frame to a point cloud and reconstructing a surface mesh upon that. The result, once again, is sent to OpenraveYarpWorld.

WIP at https://github.com/roboticslab-uc3m/openrave-yarp-plugins/commit/e82a455f4456dab9239b40cfadd57688bbe97110. I have reimplemented the world mk mesh command in the server part so that it parses a bottle of mesh vertices and a bottle of mesh indices. It is intended for simple meshes, although the new openraveYarpWorldClientMesh already exploits this feature.

It's not efficient, though, and it takes a long time to render a mesh. I need to study valid downsampling strategies, then decide whether I'm going to keep sending parsed bottles or, ideally, bulk send a pair of yarp::sig::PointCloudXYZ (vertices) + yarp::sig::VectorOf<int> (indices). In the latter scenario, either provide an alternative connection reader to the existing port callback, or create a new port altogether specifically for handling meshes.
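
For illustration, a sketch of the "bulk send" alternative, assuming yarp::os::PortablePair is used to pair up the vertex cloud and the index vector on a dedicated port; the message alias and function name are mine, not existing code:

#include <yarp/os/BufferedPort.h>
#include <yarp/os/PortablePair.h>
#include <yarp/sig/PointCloud.h>
#include <yarp/sig/Vector.h>

using MeshMessage = yarp::os::PortablePair<yarp::sig::PointCloudXYZ, yarp::sig::VectorOf<int>>;

void sendMesh(yarp::os::BufferedPort<MeshMessage> & port,
              const yarp::sig::PointCloudXYZ & vertices,
              const yarp::sig::VectorOf<int> & indices)
{
    auto & msg = port.prepare();
    msg.head = vertices; // mesh vertices
    msg.body = indices;  // triangle indices, three per face
    port.write();
}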

PeterBowman commented 3 years ago

I have applied my recent findings on the pcl::OrganizedFastMesh algorithm for another take on the live-mesh idea, which I had so far discarded. With proper configuration (horizontal-vertical decimation every 15 pixels, triangle adaptive cut), on my machine it takes only 16-18 ms to reconstruct the surface mesh. Add to that another 30-33 ms to render this mesh in the OpenRAVE viewer, plus 13-20 ms for data conversion (most of that devoted to computing a point cloud from the depth data). In this setup, processing HD frames (1280x720), it can work at a much tighter refresh rate of 70 ms.

Edit: also perfectly fine at 640x480, decimation every 10 pixels and a refresh rate of 40 ms.

Edit2: an initial transformation step at 1280x720 consumes 6-7 ms.

PeterBowman commented 3 years ago

It's not efficient, though, and it takes a long time to render a mesh. I need to study valid downsampling strategies, then decide whether I'm going to keep sending parsed bottles or, ideally, bulk send a pair of yarp::sig::PointCloudXYZ (vertices) + yarp::sig::VectorOf<int> (indices). In the latter scenario, either provide an alternative connection reader to the existing port callback, or create a new port altogether specifically for handling meshes.

Downsampling through decimation greatly improves processing times on the OR side. Improving the pipeline at https://github.com/roboticslab-uc3m/openrave-yarp-plugins/commit/ffdce04c8d87f5356261acba9e332f2468fb2c6d totally fixed this.

Commit https://github.com/roboticslab-uc3m/openrave-yarp-plugins/commit/66c048925a982b5f592177bfd5660f999d40be77 enables retrieving current robot+camera pose in order to apply the correct transformation to the point cloud. New configuration files for the corresponding DH parameters have been added at https://github.com/roboticslab-uc3m/teo-configuration-files/commit/2761ace368d66c4a517550790597618fd5f2fbf6. Still, it is possible to perform an additional transformation via --pos and --ori.
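
As a side note, a minimal sketch of applying such a transformation to the cloud before meshing, assuming the retrieved pose is already expressed as an Eigen transform; the helper name is illustrative:

#include <Eigen/Geometry>
#include <pcl/common/transforms.h>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>

pcl::PointCloud<pcl::PointXYZ>::Ptr transformToRobotFrame(const pcl::PointCloud<pcl::PointXYZ>::ConstPtr & cloud,
                                                          const Eigen::Affine3f & cameraPose)
{
    // Express the sensor-frame cloud in the robot/world frame.
    pcl::PointCloud<pcl::PointXYZ>::Ptr transformed(new pcl::PointCloud<pcl::PointXYZ>());
    pcl::transformPointCloud(*cloud, *transformed, cameraPose);
    return transformed;
}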

PeterBowman commented 3 years ago

I have applied my recent findings on the pcl::OrganizedFastMesh algorithm for another take on the live-mesh idea, which I had so far discarded.

Implemented at https://github.com/roboticslab-uc3m/openrave-yarp-plugins/commit/eeabd1150376c0fb2754c787f69912c776b1de6d. I'm fetching the actual sensor transformation from TEO's XML via the OpenRAVE environment (as opposed to starting a remote cartesian server, which is kinda silly, as the data also originates from OR itself). Check this video:

(embedded video)

Please note I had to modify teo.robot.xml. More precisely, I added an additional local Z-axis rotation to <rotationaxis>0 1 0 90</rotationaxis>, thus resulting in <rotationaxis>0.5773503 -0.5773503 0.5773503 240</rotationaxis> which corresponds to the actual sensor axes (check also TEO diagrams). I am not changing the XMLs for now since this may impact other applications (@jgvictores).

Commands to reproduce (using a Realsense2 camera at home, not the real robot's ASUS Xtion):

$ teoSim
$ yarpdev --context sensors --from realsense-d4xx.ini
$ openraveYarpPluginLoaderClient --device YarpOpenraveMeshFromRealDepth \
                                 --robotIndex 0 --depthSensorIndex 0 --remote /realsense2
$ yarp rpc /teoSim/head/rpc:i
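
For reference, a minimal sketch of fetching the depth sensor pose straight from the OpenRAVE environment, matching the --robotIndex 0 --depthSensorIndex 0 options above; the helper name is illustrative:

#include <vector>
#include <openrave/openrave.h>

OpenRAVE::Transform getDepthSensorPose(OpenRAVE::EnvironmentBasePtr env)
{
    std::vector<OpenRAVE::RobotBasePtr> robots;
    env->GetRobots(robots);

    auto robot = robots.at(0);                       // --robotIndex 0
    auto sensor = robot->GetAttachedSensors().at(0); // --depthSensorIndex 0
    return sensor->GetTransform();                   // sensor pose in world coordinates
}
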
PeterBowman commented 3 years ago

Please note I had to modify teo.robot.xml. More precisely, I added an additional local Z-axis rotation to <rotationaxis>0 1 0 90</rotationaxis>, thus resulting in <rotationaxis>0.5773503 -0.5773503 0.5773503 240</rotationaxis> which corresponds to the actual sensor axes (check also TEO diagrams). I am not changing the XMLs for now since this may impact other applications (@jgvictores).

I have confirmed this was a bug, fixed in https://github.com/roboticslab-uc3m/teo-openrave-models/commit/92d4de9ffa83125f5651f2e077bd3e65806a1eaf. By the way, the virtual depth sensor does not seem capable of perceiving the robot itself, i.e. I could only see environment objects on yarpview (table, mug, red can...) but not the robot arm lingering in front of the virtual sensor.

PeterBowman commented 3 years ago

Done at https://github.com/roboticslab-uc3m/openrave-yarp-plugins/commit/b757fd95a53ce100ff47bf50cf15095107ba56f4. To sum up: