Open ThomasParistech opened 2 years ago
The dists attribute is the 2D distance of each pixel to the corresponding 3D point (indexed by the appropriate fragments attribute). Not sure what the error is here? What would you expect to see?
1) My bad. I wasn't sure about the meaning of the attribute 'dists'. Since its values lie in [1e-9, 1e-4], I first thought it might be an error, which could have explained the wrong rendered image.
2) My real issue is that I don't manage to properly set znear and zfar. I expect to see only red points on the image. I assumed that setting the znear and zfar parameters of the camera would define the view frustum. Why do I still see points closer than znear in the rendered image?
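To make point 1) concrete, here is a minimal sketch of why values in [1e-9, 1e-4] are plausible for dists. It assumes that dists holds the squared 2D distance in NDC between the pixel centre and the projected point (the rasterizer output is named dists2 in forward) and that a point only contributes to a pixel when that squared distance is below radius**2. The helper name squared_ndc_distance and the sample coordinates are made up for illustration.

```python
def squared_ndc_distance(pixel_xy, point_xy):
    """Squared 2D distance between a pixel centre and a projected point (both in NDC)."""
    dx = pixel_xy[0] - point_xy[0]
    dy = pixel_xy[1] - point_xy[1]
    return dx * dx + dy * dy


radius = 0.01  # default radius in the rasterize_points signature
pixel = (0.100, 0.200)  # hypothetical pixel centre in NDC
point = (0.104, 0.203)  # hypothetical projected point, 0.005 away from the pixel

d2 = squared_ndc_distance(pixel, point)
# Assumption: the point covers the pixel only if d2 < radius**2 = 1e-4.
covers_pixel = d2 < radius * radius
print(d2, covers_pixel)
```

With radius = 0.01 the upper bound is 1e-4, which would match the range observed above.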
My real issue is that I don't manage to properly set znear and zfar. I expect to see only red points on the image.
znear and zfar define the camera transform. So you should check what the meaning of znear/zfar is in that camera. Different cameras have different definitions. This means that you should check the camera definition you are using in the code and answer your question. I could look for you and give you the answer but I think that's an exercise users should do because it directly affects your project and having a complete understanding of it is important. In general, it's all math and the answer is in the code!
I thought the definition of znear and zfar was consistent with the depth value returned by zbuf. I'll dive deeper into the camera definition then!
Unfortunately, there's an ambiguous TODO in the code and the pointclouds are not passed as they're supposed to be.
When I call the PointsRasterizer::forward on a pointcloud, it first converts the pointcloud to the NDC space using the camera model (PointsRasterizer::transform).
But this transform method overwrites the NDC depth using the Z from the camera space instead.
pts_ndc[..., 2] = pts_view[..., 2]
def transform(self, point_clouds, **kwargs) -> torch.Tensor:
    """
    Args:
        point_clouds: a set of point clouds
    Returns:
        points_proj: the points with positions projected
        in NDC space
    NOTE: keeping this as a separate function for readability but it could
    be moved into forward.
    """
    cameras = kwargs.get("cameras", self.cameras)
    if cameras is None:
        msg = "Cameras must be specified either at initialization \
            or in the forward pass of PointsRasterizer"
        raise ValueError(msg)
    pts_world = point_clouds.points_padded()
    # NOTE: Retaining view space z coordinate for now.
    # TODO: Remove this line when the convention for the z coordinate in
    # the rasterizer is decided. i.e. retain z in view space or transform
    # to a different range.
    eps = kwargs.get("eps", None)
    pts_view = cameras.get_world_to_view_transform(**kwargs).transform_points(
        pts_world, eps=eps
    )
    # view to NDC transform
    to_ndc_transform = cameras.get_ndc_camera_transform(**kwargs)
    projection_transform = cameras.get_projection_transform(**kwargs).compose(
        to_ndc_transform
    )
    pts_ndc = projection_transform.transform_points(pts_view, eps=eps)
    pts_ndc[..., 2] = pts_view[..., 2]
    point_clouds = point_clouds.update_padded(pts_ndc)
    return point_clouds
From there, the PointsRasterizer calls rasterize_points on a pointcloud that has x, y in NDC but z in camera space.
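To illustrate the effect of that last assignment, here is a minimal pure-Python sketch. It is not the library code: ortho_ndc_z and transform_depth are hypothetical helpers, and the linear mapping (znear to 0, zfar to 1) is an assumption based on my reading of the orthographic camera. The point is only that once the NDC depth is overwritten with the view-space depth, the value handed to the rasterizer no longer depends on znear or zfar.

```python
def ortho_ndc_z(z_view, znear, zfar):
    """Assumed linear depth mapping for an orthographic camera: znear -> 0, zfar -> 1."""
    return (z_view - znear) / (zfar - znear)


def transform_depth(z_view, znear, zfar, keep_view_z=True):
    """Mimic the end of transform(): optionally overwrite the NDC z with the view-space z."""
    z_ndc = ortho_ndc_z(z_view, znear, zfar)
    return z_view if keep_view_z else z_ndc


z_view = 1.0  # hypothetical point one unit in front of the camera
for znear, zfar in [(0.1, 10.0), (2.0, 10.0), (5.0, 50.0)]:
    overwritten = transform_depth(z_view, znear, zfar)  # what the rasterizer receives
    true_ndc = transform_depth(z_view, znear, zfar, keep_view_z=False)
    print(znear, zfar, overwritten, true_ndc)
```

The overwritten depth stays at 1.0 for every (znear, zfar) pair, while the NDC depth changes and becomes negative as soon as znear exceeds the point's depth.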
def forward(self, point_clouds, **kwargs) -> PointFragments:
    """
    Args:
        point_clouds: a set of point clouds with coordinates in world space.
    Returns:
        PointFragments: Rasterization outputs as a named tuple.
    """
    points_proj = self.transform(point_clouds, **kwargs)
    raster_settings = kwargs.get("raster_settings", self.raster_settings)
    idx, zbuf, dists2 = rasterize_points(
        points_proj,
        image_size=raster_settings.image_size,
        radius=raster_settings.radius,
        points_per_pixel=raster_settings.points_per_pixel,
        bin_size=raster_settings.bin_size,
        max_points_per_bin=raster_settings.max_points_per_bin,
    )
    return PointFragments(idx=idx, zbuf=zbuf, dists=dists2)
According to the doc, rasterize_points expects NDC z in [-1, 1] (or even [0, 1], since znear maps to 0 and zfar to 1).
def rasterize_points(
    pointclouds,
    image_size: Union[int, List[int], Tuple[int, int]] = 256,
    radius: Union[float, List, Tuple, torch.Tensor] = 0.01,
    points_per_pixel: int = 8,
    bin_size: Optional[int] = None,
    max_points_per_bin: Optional[int] = None,
):
    ....
    Args:
        pointclouds: A Pointclouds object representing a batch of point clouds to be
            rasterized. This is a batch of N pointclouds, where each point cloud
            can have a different number of points; the coordinates of each point
            are (x, y, z). The coordinates are expected to
            be in normalized device coordinates (NDC): [-1, 1]^3 with the camera at
            (0, 0, 0); In the camera coordinate frame the x-axis goes from right-to-left,
            the y-axis goes from bottom-to-top, and the z-axis goes from back-to-front.
    ....
I can't see the implementation of _C.rasterize_points(*args), but in its naive Python counterpart, rasterize_points_python, there's a check on the z value to see if the point is visible. I don't understand how it could possibly work since the z value isn't in NDC space.
points_packed = pointclouds.points_packed()
....
px, py, pz = points_packed[p, :]
if pz < 0:
    continue
By the way, I tried crazy values for znear and zfar in the example above and I always get the same rendered image, with green on top and red dots below. This makes me think that it's not just a matter of properly tuning the values (x2, scale, etc.).
You were completely right in telling me to look at the math/code, but here it looks like the znear and zfar parameters are ignored during clipping because of pts_ndc[..., 2] = pts_view[..., 2]
Am I still missing a point? :sweat_smile:
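To make the argument concrete, here is a small pure-Python sketch comparing the two visibility tests. visible_with_view_z mirrors the pz < 0 check quoted from the naive rasterizer; visible_with_ndc_z is a hypothetical check that would honour znear/zfar, assuming a linear NDC depth in [0, 1]. The depths 1.0 and 3.0 are illustrative values, not taken from the library.

```python
def visible_with_view_z(z_view):
    """Mirror of the naive check: only points with negative depth are skipped."""
    return not (z_view < 0)


def visible_with_ndc_z(z_view, znear, zfar):
    """Hypothetical check honouring znear/zfar, assuming z_ndc is linear in [0, 1]."""
    z_ndc = (z_view - znear) / (zfar - znear)
    return 0.0 <= z_ndc <= 1.0


znear, zfar = 2.0, 10.0
depths = {"near point": 1.0, "far point": 3.0}  # illustrative view-space depths
for name, z in depths.items():
    print(name, visible_with_view_z(z), visible_with_ndc_z(z, znear, zfar))
```

With the view-space check both points are kept, whereas a check on the NDC depth would drop the point in front of znear.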
@gkioxari Since in practice only negative depths are pruned out (equivalent to znear=0, zfar=+inf), we can shift the orthographic camera by a length znear - epsilon along the front vector, render the image, and filter out the pixels at which zbuf is larger than zfar - znear + epsilon.
It's not very clean but it does the job... ;)
N.B. Of course it works only for orthographic cameras
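Here is a rough sketch of that workaround on plain depth values, without any rendering. Everything in it is illustrative: clip_depths_with_shift is a hypothetical helper, and I use -1.0 as the "no point" value, assuming that matches what zbuf holds for empty pixels.

```python
BACKGROUND = -1.0  # assumed marker for "no point at this pixel"


def clip_depths_with_shift(view_depths, znear, zfar, eps=1e-6):
    """Emulate znear/zfar clipping for an orthographic camera by shifting it forward."""
    shift = znear - eps  # move the camera along its front vector
    zbuf = []
    for z in view_depths:
        z_shifted = z - shift  # depth as seen by the shifted camera
        if z_shifted < 0:
            # Pruned by the rasterizer's negative-depth check (acts as znear).
            zbuf.append(BACKGROUND)
        elif z_shifted > zfar - znear + eps:
            # Post-filter on zbuf (acts as zfar).
            zbuf.append(BACKGROUND)
        else:
            zbuf.append(z_shifted)
    return zbuf


# Depths 1.0 and 12.0 fall outside [znear, zfar] = [2, 10]; only 3.0 survives.
result = clip_depths_with_shift([1.0, 3.0, 12.0], znear=2.0, zfar=10.0)
print(result)
```

The surviving depths are relative to the shifted camera, so add znear - eps back if you need them in the original camera frame.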
Are you planning to update the z coordinate convention used in the rasterizer to make orthographic znear and zfar clip work?
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.
Has this problem been resolved? I'm also confused that
# pyre-fixme[16]: Module pytorch3d has no attribute _C.
idx, zbuf, dists = _C.rasterize_points(*args)
Thanks for your great library :+1:
🐛 Bugs / Unexpected behaviors
I tried to write my custom PointsRenderer class to render a colored pointcloud with an fov orthographic camera. But it turns out the tensor contained in the dists attribute computed by the rasterizer only contains zero or -1 values. On the other hand the zbuf attribute is alright.
The z_near and z_far also don't seem to have any effect.
Instructions To Reproduce the Issue:
My specs: pytorch3d==0.6.1 torch==1.10.0+cu111 torchaudio==0.10.0 torchvision==0.11.1+cu111
In the example below, the camera is located at (0,2,0) and looks towards -Y. There is a red plane at Y=-1 and a green plane at Y=1. Even though I explicitly ask for z_near=2.0, I still see the green plane in the rendered image.
Here's the output