ChrisWu1997 / PQ-NET

code for our CVPR 2020 paper "PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes"
MIT License
116 stars 19 forks

Some questions about generating results #13

Closed yangjob closed 3 years ago

yangjob commented 3 years ago

Hello, I am a newbie in this direction. Your article is very well written. I have a question about how the results are generated: the implicit decoder is just a classifier, so how are the coordinates of the actual points generated? Thanks.

ChrisWu1997 commented 3 years ago

The naive way, which is what we did, is to simply query every grid point at a certain resolution. For example, if we set the resolution to 64^3, we take all 64^3 grid points in the voxel grid and feed them to the decoder.

Of course, this causes many unnecessary queries, so a more efficient scheme is proposed in OccNet. And nowadays there are many other works that try to improve the efficiency of implicit networks.
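A minimal sketch of this naive dense query, assuming a hypothetical `decoder` callable that maps normalized (N, 3) coordinates to (N,) inside/outside scores (a stand-in for PQ-NET's actual part decoder, whose interface may differ):

```python
import numpy as np

def dense_grid_query(decoder, resolution=64, threshold=0.5):
    """Naively query an implicit decoder at every grid point.

    `decoder` maps an (N, 3) array of coordinates in [0, 1] to an
    (N,) array of occupancy scores (hypothetical interface).
    """
    # Build all resolution^3 grid-point coordinates, normalized to [0, 1].
    coords = np.linspace(0.0, 1.0, resolution)
    grid = np.stack(np.meshgrid(coords, coords, coords, indexing="ij"), axis=-1)
    points = grid.reshape(-1, 3)        # (resolution^3, 3)

    # Query every single point -- simple, but many queries are wasted
    # on empty space far from the surface.
    occupancy = decoder(points)
    return occupancy.reshape(resolution, resolution, resolution) > threshold
```

The resulting boolean voxel grid can then be turned into a mesh with marching cubes.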

yangjob commented 3 years ago

> The naive way, which is what we did, is to simply query every grid point at a certain resolution. For example, if we set the resolution to 64^3, we take all 64^3 grid points in the voxel grid and feed them to the decoder.
>
> Of course, this causes many unnecessary queries, so a more efficient scheme is proposed in OccNet. And nowadays there are many other works that try to improve the efficiency of implicit networks.

OK, I got it.

yangjob commented 3 years ago

In other words, after the bounding box is generated, it is divided into a grid, the coordinates of each voxel are computed, and then they are sent to the decoder, right? Does the final result need to use the same number of points as during training? Thank you for your patient reply.

ChrisWu1997 commented 3 years ago

> In other words, after the bounding box is generated, it is divided into a grid, the coordinates of each voxel are computed, and then they are sent to the decoder, right?

Partly right. Each part is generated separately in its own local space by the decoder using 64^3 query points, so every part initially lives in its own 64^3 box. Then we use the predicted bounding box to transform each part to its correct position in the global space.
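The local-to-global step above can be sketched as follows, assuming the predicted box is parameterized by a center and a size (hypothetical names; PQ-NET's actual box parameterization may differ):

```python
import numpy as np

def local_to_global(local_points, box_center, box_size):
    """Map part points from the local [0, 1]^3 space into global space.

    `box_center` and `box_size` stand in for a predicted part bounding
    box (assumed parameterization, for illustration only).
    """
    # Recenter local coordinates to [-0.5, 0.5]^3, scale by the box
    # size, then translate to the box center.
    return (local_points - 0.5) * box_size + box_center
```

Applying this per part assembles the independently decoded parts into one shape.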

> Does the final result need to use the same number of points as during training?

Not necessarily. The part autoencoder (implicit decoder) is trained at 64^3 resolution, but you can use any resolution at test time, because point coordinates are normalized into [0, 1] before being fed into the network. This is a nice property of implicit neural representations; see DeepSDF, IM-NET, OccNet, etc.
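A small illustration of why the test resolution is free to differ: under the assumed normalization, the grid coordinates always land in [0, 1] no matter how many points are sampled, so the decoder sees the same input range it was trained on:

```python
import numpy as np

def normalized_grid(resolution):
    """Grid coordinates normalized to [0, 1], independent of resolution.

    Whatever `resolution` is chosen at test time, the decoder only ever
    sees coordinates in [0, 1] -- the same range as in training.
    """
    # Cell-center sampling: coordinates stay strictly inside [0, 1].
    coords = (np.arange(resolution) + 0.5) / resolution
    grid = np.stack(np.meshgrid(coords, coords, coords, indexing="ij"), axis=-1)
    return grid.reshape(-1, 3)
```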

yangjob commented 3 years ago

> > In other words, after the bounding box is generated, it is divided into a grid, the coordinates of each voxel are computed, and then they are sent to the decoder, right?
>
> Partly right. Each part is generated separately in its own local space by the decoder using 64^3 query points, so every part initially lives in its own 64^3 box. Then we use the predicted bounding box to transform each part to its correct position in the global space.
>
> > Does the final result need to use the same number of points as during training?
>
> Not necessarily. The part autoencoder (implicit decoder) is trained at 64^3 resolution, but you can use any resolution at test time, because point coordinates are normalized into [0, 1] before being fed into the network. This is a nice property of implicit neural representations; see DeepSDF, IM-NET, OccNet, etc.

OK, thank you very much.