Image Encoder: Hourglass Architecture

ckxz commented 3 years ago

Awesome work @shunsukesaito.

I have some doubts regarding the implemented architecture of the Image Encoder for surface reconstruction. According to your paper, its architecture is based on the Hourglass Net proposed in this paper. From my understanding, Hourglass Nets are a type of convolutional encoder-decoder framework, but I cannot identify the "decoding" operations in your code: I see neither nearest-neighbor upsampling nor transposed convolutions.

My assumption is that the Image Encoder is implemented as a stack of several Hourglass Net halves, i.e. with only their encoding convolutions, and that the resulting feature map is fed directly to the Continuous Implicit Function (MLP). Is that correct?

Cheers.

GTO2013 commented 3 years ago

Hi, I am currently trying to understand PIFu better as well. The upscaling is done via

up2 = F.interpolate(low3, scale_factor=2, mode='bicubic', align_corners=True)

in HGFilters.py, I think. If you didn't upscale it afterwards, you could not reference the feature vector for each pixel.
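To make the role of that line concrete, here is a minimal sketch of a recursive hourglass block in which the "decoding" step is a parameter-free `F.interpolate` call rather than a transposed convolution. It is not the code from HGFilters.py: the class name `TinyHourglass`, the plain `nn.Conv2d` layers standing in for residual blocks, and the use of average pooling for downsampling are all assumptions made for illustration. Only the `F.interpolate` call is taken from the line quoted above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyHourglass(nn.Module):
    """Illustrative recursive hourglass; NOT the PIFu implementation."""

    def __init__(self, depth: int, channels: int):
        super().__init__()
        # Hypothetical stand-ins for the residual blocks of a real hourglass.
        self.skip = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.down = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.up = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        # Recurse until depth 1, then use a single conv as the bottleneck.
        self.inner = (
            TinyHourglass(depth - 1, channels)
            if depth > 1
            else nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        )

    def forward(self, x):
        # Skip branch: keeps features at the current resolution.
        up1 = self.skip(x)
        # "Encoding" half: halve height and width (assumed average pooling).
        low1 = self.down(F.avg_pool2d(x, 2))
        low2 = self.inner(low1)
        low3 = self.up(low2)
        # "Decoding" half: upsample by interpolation, with no learned weights.
        up2 = F.interpolate(low3, scale_factor=2, mode='bicubic', align_corners=True)
        # Merge the upsampled path with the skip branch.
        return up1 + up2


if __name__ == "__main__":
    x = torch.randn(1, 8, 32, 32)
    y = TinyHourglass(depth=3, channels=8)(x)
    print(y.shape)  # torch.Size([1, 8, 32, 32])
```

In this sketch each recursion level halves the spatial size and the interpolation restores it, so the block returns a feature map with the same height and width it received. Because the upsampling has no parameters, it is easy to miss when scanning the code for decoder layers.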

ckxz commented 3 years ago

Oh that makes much more sense now! Thanks for pointing that out @GTO2013, I had disregarded that line for some reason. Closing the issue.

shunsukesaito / PIFu

Image Encoder: Hourglass Architecture #84