RobotLocomotion / drake

Model-based design and verification for robotics.
https://drake.mit.edu

Provide ray-casting query in QueryObject #15260

Open SeanCurtis-TRI opened 3 years ago

SeanCurtis-TRI commented 3 years ago

A lidar sensor is best simulated using ray casting. Furthermore, the ability to find out "what lies in that direction" is generally useful. Currently, QueryObject does not provide any such query; this issue would add such an API.

Problem statement

In its simplest form, the query would take a given ray (point origin + direction vector) and report the first geometry the ray intersects (if any). A more general form would take an ordered list of rays and return an ordered list of results.

One of the Drake-specific flavors that would prove interesting is limiting the domain of geometry: e.g., just proximity geometry, just perception geometry, just a specific collection of geometries, etc. It would be good to provide an API that allows limiting the set of candidate geometries.


Everything below this line is still in flux. The issue was created to support OKRs, but the proposed details are still very much T.B.D.

Thoughts:

Proposed API

  /** Represents a ray: a half-line with an origin and a direction. A ray is
   a quantity that depends on a frame. When defining a ray, the ray's origin
   must be measured and expressed in the same frame in which its direction is
   expressed. In notation, the ray's frame should be indicated: e.g.,
   `Ray<T> ray_W{Vector3<T>::Zero(), Vector3<T>::UnitX()};`. */
  template <typename T>
  struct Ray {
    Vector3<T> origin;
    Vector3<T> direction;
  };

  /** The result of a geometry-ray intersection calculation: the id of the
   intersected geometry, the distance from the ray's origin to the
   intersection, and the intersection point on the geometry. If there is no
   intersection, `id` will be std::nullopt and neither `distance` nor
   `point_on_geometry` will have a meaningful value. */
  template <typename T>
  struct RayIntersection {
    std::optional<GeometryId> id;
    T distance{};
    Vector3<T> point_on_geometry;
  };

template <typename T>
class QueryObject {
...
  // TODO: Change this API to support perception.
  /** For each ray in `rays`, determines the first intersection between that
   ray and a set of operational geometries and reports relevant data related to
   that intersection.

   The "set of operational geometries" is drawn from those geometries with the
   proximity role assigned. The set can be reduced further by providing a
   GeometrySet. Any GeometryId in the set that doesn't have the proximity role
   will simply be ignored. */
  std::vector<RayIntersection<T>> ComputeRayIntersections(
      const std::vector<Ray<T>>& rays,
      const GeometrySet& geo_set = GeometrySet{}) const;
...
};
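
For illustration, a call site for the proposed query might look like the following (a minimal sketch against the proposed, not-yet-existing API; `query_object` and `wall_id` are hypothetical placeholders for a QueryObject<double> and a GeometryId from the surrounding program):

  // Sketch only: exercises the proposed API above, which does not yet exist.
  // Cast one ray from the world origin along +x, restricted to one geometry.
  const Ray<double> ray_W{Vector3<double>::Zero(), Vector3<double>::UnitX()};
  const GeometrySet geometries{wall_id};  // `wall_id` is a placeholder id.
  const std::vector<RayIntersection<double>> results =
      query_object.ComputeRayIntersections({ray_W}, geometries);
  if (results[0].id.has_value()) {
    // A hit: `distance` and `point_on_geometry` are meaningful here.
    drake::log()->info("Hit at distance {}.", results[0].distance);
  }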

Implementation plan

  1. Provide a Drake-side geometry Bvh (bounding volume hierarchy).
  2. Build the Bvh on perception geometry (see #15261).
  3. Implement all of the ray-geometry intersection methods (a sketch of one such method follows).
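
As a sketch of what one of the step-3 methods could look like, here is textbook ray-sphere intersection (a standalone illustration, not Drake code; `RaySphereIntersect` is a hypothetical name, and the ray direction is assumed to be unit length):

  #include <cmath>
  #include <optional>

  #include <Eigen/Dense>

  // Illustrative only: solves ||origin + t*direction - center||^2 = radius^2
  // for the smallest t >= 0; returns nullopt when the ray misses the sphere.
  std::optional<double> RaySphereIntersect(
      const Eigen::Vector3d& origin_W, const Eigen::Vector3d& direction_W,
      const Eigen::Vector3d& center_W, double radius) {
    const Eigen::Vector3d oc = origin_W - center_W;
    // With a unit direction, t solves t^2 + 2*b*t + c = 0.
    const double b = oc.dot(direction_W);
    const double c = oc.squaredNorm() - radius * radius;
    const double discriminant = b * b - c;
    if (discriminant < 0) return std::nullopt;  // No real roots: clean miss.
    const double sqrt_d = std::sqrt(discriminant);
    // Prefer the nearer root; fall back to the farther one when the origin
    // lies inside the sphere.
    const double t = (-b - sqrt_d >= 0) ? -b - sqrt_d : -b + sqrt_d;
    if (t < 0) return std::nullopt;  // Sphere lies entirely behind the ray.
    return t;
  }
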
sherm1 commented 3 years ago

A max_distance or threshold parameter (as we have for some of the distance queries) would likely be useful for performance reasons here also. For robotics it's likely that objects beyond a certain distance are irrelevant and in a home environment there are probably a lot of those.
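
For illustration, the proposed signature could grow such a threshold (hypothetical; the `max_distance` name just mirrors the existing distance queries):

  // Hypothetical variant of the proposed query: any hit farther than
  // max_distance is reported as a miss, which would let a BVH traversal
  // prune distant geometry early.
  std::vector<RayIntersection<T>> ComputeRayIntersections(
      const std::vector<Ray<T>>& rays,
      const GeometrySet& geo_set = GeometrySet{},
      double max_distance = std::numeric_limits<double>::infinity()) const;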

SeanCurtis-TRI commented 3 years ago

Agreed.

DamrongGuoy commented 2 years ago

@joemasterjohn is interested. Add him to assignees.

SeanCurtis-TRI commented 1 year ago

Capturing some thoughts triggered by a recent conversation:

Simply doing raycasting would be the basic barebones API. At a higher level, it would be convenient to be able to simply declare a "lidar sensor" and the geometry that would be "visible" to that sensor.

For perception geometry, we have a mechanism of stating which render engine is associated with a particular geometry. The lidar sensor is analogous to the rgbd sensor so it would be convenient if the same abstraction applied here.

Things get a bit weird because the RGBD sensor is very much a rendering operation (with a supporting RenderEngine implementation), while ray casting is a proximity query with no such render engine. So the line becomes quite fuzzy; one wouldn't associate it with a "render engine", per se.

With raycasting, it's almost like a different role. The geometry could be higher-resolution than what we'd want for collision-related proximity queries, but not the same resolution as we'd want for RGB cameras.

Something to ponder.

jwnimmer-tri commented 1 year ago

My intuition re: the modeling choices would go like this:

Separately, if we want to do ray casting for motion planning, that would use the proximity geometry.

So the question is: is this ticket a feature request for bare-bones raycasting, or is it actually a request for a lidar sensor? They are different things, and I think we'd go about them differently.

calderpg-tri commented 1 year ago

The motivation at TRI is simulating lidar sensors, so that's more important than just providing a basic raycasting query.

I think the strongest argument for trying to provide a performant generic raycasting query (presumably in the form of a batch query) is to make it easier to simulate more different lidar sensors down the road. "Lidar sensor" here really covers a wide variety of sensor capabilities: 2D planar, "2.5D" multi-beam, 3D, camera-like FOV, 360-degree coverage, non-repeating beam patterns, etc.; so "simulate a lidar sensor" is really a family of different lidar simulations, all of which use raycasting.*

*It's very likely we will be using multiple different lidar types on one robot in the near future, definitely including models with non/semi-repeating beam patterns.
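
To make the "family of lidar simulations" point concrete, a 2D planar scanner reduces to generating one fan of rays and handing it to the batch query (a minimal sketch using the `Ray` struct proposed above; `MakePlanarScanRays` is a hypothetical helper, rays are expressed in the sensor frame S, and `num_beams >= 2` is assumed):

  #include <cmath>
  #include <vector>

  #include <Eigen/Dense>

  // Illustrative only: build the rays for one 2D planar lidar scan. Other
  // lidar models (multi-beam, 360-degree, non-repeating patterns) differ
  // only in how this list of rays is generated; the batch intersection
  // query stays the same.
  std::vector<Ray<double>> MakePlanarScanRays(
      const Eigen::Vector3d& origin_S, double fov_radians, int num_beams) {
    std::vector<Ray<double>> rays;
    rays.reserve(num_beams);
    for (int i = 0; i < num_beams; ++i) {
      // Sweep the beams evenly across the field of view in the x-y plane.
      const double angle =
          -fov_radians / 2 + fov_radians * i / (num_beams - 1);
      rays.push_back(Ray<double>{
          origin_S, Eigen::Vector3d(std::cos(angle), std::sin(angle), 0.0)});
    }
    return rays;
  }

Each of the other lidar models listed above would swap in a different ray-generation routine while reusing the same batch query.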