
3DILG: Irregular Latent Grids for 3D Generative Modeling
https://1zb.github.io/3DILG/

What exactly does x (i.e., the surface) represent in the forward function of Autoencoder? #8

Open zhanghm1995 opened 1 year ago

zhanghm1995 commented 1 year ago

Hi, while reading the code I found that the Autoencoder takes two inputs, x and points. It looks like x is the surface variable, so what exactly does it mean?

Since the reconstruction task only needs a point cloud as input, why do we need the "surface" as one of the inputs?

Maybe I'm misunderstanding something, since I cannot download such a large dataset.

1zb commented 1 year ago

There are two kinds of points in the reconstruction task: 1) surface points (x) and 2) query points (points). To reconstruct the surface, we want to be able to obtain the labels (or SDFs) for any query point in 3D space. Then we can apply an iso-surface extraction method (e.g., Marching Cubes) to get the desired 3D polygonal mesh.
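For intuition, here is a minimal, hypothetical sketch (not the repository's actual code; all names are made up) of how the two inputs typically flow through such an occupancy autoencoder: x feeds only the encoder, while points are the coordinates at which occupancy is predicted.

```python
# Hypothetical sketch of an occupancy autoencoder's forward pass.
import torch
import torch.nn as nn

class OccupancyAutoencoder(nn.Module):
    def __init__(self, latent_dim=256):
        super().__init__()
        # stand-in encoder: per-point MLP + max pool over the surface points
        self.encoder = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, latent_dim))
        # stand-in decoder: latent + query coordinate -> occupancy logit
        self.decoder = nn.Sequential(nn.Linear(latent_dim + 3, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, x, points):
        # x:      (B, N, 3) surface points sampled from the shape (encoder input)
        # points: (B, M, 3) query points anywhere in 3D space (where occupancy is predicted)
        latent = self.encoder(x).max(dim=1).values                    # (B, latent_dim)
        latent = latent.unsqueeze(1).expand(-1, points.shape[1], -1)  # (B, M, latent_dim)
        logits = self.decoder(torch.cat([latent, points], dim=-1)).squeeze(-1)
        return logits  # (B, M) occupancy logits, one per query point
```

At test time, points would be a dense regular grid, and the thresholded predictions are what Marching Cubes consumes.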

zhanghm1995 commented 1 year ago

Thanks for your reply.

So that means you use the surface points (x) in the encoder to get embeddings of centers sampled from the surface points, and when decoding, you use the query points (they contain vol_points and near_points, right?), combined with those center embeddings, to obtain the classification labels in 3D space.

However, you use all the query points in the decoder. Do these query points leak ground-truth information that makes learning the classification labels easier, since they already carry the occupancy information of 3D space?

Maybe I misunderstand something.

BTW, if I just want to train an autoencoder that reconstructs the input point cloud itself, how can I do this?

1zb commented 1 year ago

Learning neural fields (a.k.a. neural implicit representations or coordinate-based networks) means representing shapes with a function (an MLP in this case).

  1. At test time, we can query the occupancy of any query point
  2. At training time, we need ground-truth occupancies (they are our main optimization target; see the training-step sketch below)
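Continuing the hypothetical OccupancyAutoencoder stub above (again, not the actual training code), a single training step would look roughly like this: the query coordinates go into the network, while the ground-truth occupancies appear only in the loss, so no label information is fed into the prediction itself.

```python
# Hedged sketch of one training step with hypothetical names.
import torch
import torch.nn.functional as F

def train_step(model, optimizer, surface, queries, gt_occupancy):
    # surface:      (B, N, 3) points on the surface  -> encoder
    # queries:      (B, M, 3) coordinates only       -> decoder
    # gt_occupancy: (B, M)    0/1 labels (float), used only as the optimization target
    logits = model(surface, queries)
    loss = F.binary_cross_entropy_with_logits(logits, gt_occupancy)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```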

For your case (I assume it's a point cloud autoencoder), the task is quite different from ours. However, you can still try to reuse our point cloud encoder and build an upsampling decoder on top of it.
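A minimal sketch of that idea, assuming a global-latent set encoder and a fixed-size MLP decoder trained with a Chamfer distance (all names here are made up, not the repository's code):

```python
# Hypothetical point cloud autoencoder sketch.
import torch
import torch.nn as nn

class PointCloudAE(nn.Module):
    def __init__(self, latent_dim=256, num_out=2048):
        super().__init__()
        # stand-in set encoder: per-point MLP + max pool -> global latent
        self.encoder = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, latent_dim))
        # stand-in "upsampling" decoder: latent -> fixed number of 3D points
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 512), nn.ReLU(), nn.Linear(512, num_out * 3))
        self.num_out = num_out

    def forward(self, pts):                               # pts: (B, N, 3)
        latent = self.encoder(pts).max(dim=1).values      # (B, latent_dim)
        return self.decoder(latent).view(-1, self.num_out, 3)

def chamfer_distance(a, b):
    # symmetric Chamfer distance between point sets a: (B, N, 3) and b: (B, M, 3)
    d = torch.cdist(a, b)                                 # (B, N, M) pairwise distances
    return d.min(dim=2).values.mean() + d.min(dim=1).values.mean()
```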

We will release a subset of the datasets. We use OccNet's repository to convert ShapeNet models to watertight meshes first. Then we sample points on the surfaces to get point cloud representations of the models, and sample labeled (inside/outside) points in the bounding volume. However, if you are only interested in point clouds, just use any polygonal mesh processing software (e.g., trimesh) to do surface sampling.
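For example, a rough sketch of that preparation with trimesh (the file path and sample counts are placeholders; note that mesh.contains is only reliable for watertight meshes, which is why the watertight conversion comes first):

```python
# Hedged data-preparation sketch using trimesh.
import numpy as np
import trimesh

mesh = trimesh.load("model_watertight.obj")          # placeholder path

# surface point cloud: uniform sampling on the mesh surface
surface_points, _ = trimesh.sample.sample_surface(mesh, 100000)

# labeled query points: uniform samples in the bounding volume + inside/outside labels
lo, hi = mesh.bounds                                 # axis-aligned bounding box corners
vol_points = np.random.uniform(lo, hi, size=(100000, 3))
occupancy = mesh.contains(vol_points)                # True = point lies inside the mesh
```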