nywang16 / Pixel2Mesh

Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images. In ECCV2018.
http://openaccess.thecvf.com/content_ECCV_2018/papers/Nanyang_Wang_Pixel2Mesh_Generating_3D_ECCV_2018_paper.pdf
Apache License 2.0

Questions about models.py and fetcher.py #74

Open TowerTowerLee opened 4 years ago

TowerTowerLee commented 4 years ago

Thank you for sharing this code and your work! I would like to ask the following questions:

  1. In the `build_cnn18` function in models.py there is `x = self.placeholders['img_inp']`, while I see `img_inp, y_train, data_id = data.fetch()` in train.py and `image, point, normal, _, _ = data.fetch()` in fetcher.py. I understand that the input to `build_cnn18` should be images. What do `point` and `normal` correspond to? It seems that `y_train` and `data_id` do not represent them.

  2. If `x = self.placeholders['img_inp']` represents the images in `build_cnn18`, what is the meaning of `x = tf.expand_dims(x, 0)`?

  3. In the paper, only conv3_3, conv4_3, and conv5_3 are concatenated. Why are four feature vectors concatenated in models.py: `self.placeholders.update({'img_feat': [tf.squeeze(x2), tf.squeeze(x3), tf.squeeze(x4), tf.squeeze(x5)]})`? Sincerely hope to get your answer~

walsvid commented 4 years ago

Hi @TowerTowerLee, thanks for your interest.

  1. `y_train` contains the point-cloud xyz coordinates and the normal vectors, so `point` and `normal` together correspond to `y_train`.
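A minimal NumPy sketch of this layout, assuming `y_train` is an (N, 6) array with the xyz coordinates in the first three columns and the normal vector in the last three (the array contents here are made up for illustration):

```python
import numpy as np

# Hypothetical ground-truth array: each row is (x, y, z, nx, ny, nz).
y_train = np.array([
    [0.1, 0.2, 0.3, 0.0, 0.0, 1.0],
    [0.4, 0.5, 0.6, 0.0, 1.0, 0.0],
])

point = y_train[:, :3]   # xyz coordinates of each sampled point
normal = y_train[:, 3:]  # per-point normal vectors

print(point.shape, normal.shape)  # -> (2, 3) (2, 3)
```

So the `point` and `normal` returned by fetcher.py are simply the two halves of what train.py receives as `y_train`.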

  2. Our code does not batch inputs (i.e. batch size = 1), so the input image is a three-dimensional tensor. TensorFlow's `conv2d` expects a four-dimensional tensor, so we expand the dimensions.
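A small sketch of the dimension expansion, using `np.expand_dims` (which behaves like `tf.expand_dims`) and an assumed 224x224 RGB input size for illustration:

```python
import numpy as np

# A single RGB image: (height, width, channels) -- no batch dimension.
img = np.zeros((224, 224, 3), dtype=np.float32)

# tf.expand_dims(x, 0) prepends a batch axis so that conv2d receives
# the 4-D (batch, height, width, channels) tensor it expects.
batched = np.expand_dims(img, 0)

print(batched.shape)  # -> (1, 224, 224, 3)
```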

  3. In the actual implementation we found that using these four layers gives better performance than the three listed in the paper.
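The effect of keeping four feature maps can be sketched as below. This is not the repository's code: the vertex count and the per-layer channel widths (64/128/256/512) are assumptions for illustration; the point is only that concatenating the per-vertex features from all four layers along the channel axis yields a wider multi-scale descriptor for each mesh vertex:

```python
import numpy as np

n_vertices = 156  # hypothetical vertex count

# Hypothetical per-vertex features pooled from the four CNN stages
# stored in the 'img_feat' placeholder (channel widths are assumed).
x2 = np.random.rand(n_vertices, 64)
x3 = np.random.rand(n_vertices, 128)
x4 = np.random.rand(n_vertices, 256)
x5 = np.random.rand(n_vertices, 512)

# Concatenate along the channel axis: each vertex gets a single
# 960-dim descriptor combining shallow and deep features.
img_feat = np.concatenate([x2, x3, x4, x5], axis=1)

print(img_feat.shape)  # -> (156, 960)
```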