autonomousvision / giraffe

This repository contains the code for the CVPR 2021 paper "GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields"
https://m-niemeyer.github.io/project-pages/giraffe/index.html
MIT License

How Does the Model Achieve the Ability to Condition on Controllable Params (shape, appearance, etc.)? #22

Closed ardianumam closed 3 years ago

ardianumam commented 3 years ago

Hi,

Thanks for the awesome work! I'm curious: how does the trained model acquire the ability to be conditioned on controllable parameters? My questions break down as follows:

  1. Shape and appearance latent codes: is the Discriminator also conditioned on the shape and appearance aspects? I cannot find this in the code. If it is indeed not conditioned, how does the model end up associating the shape latent code with control over shape in the generated data? Likewise for the appearance latent code.
  2. When sampling the transformation (s, R, T) and the camera pose per batch, does the corresponding real_data also have similar properties (s, R, T and camera pose)? If not, again, how can the model associate these controllable variables correctly? As an example of an "unwanted case": we sample T so that the generated object is on the left, but the corresponding real_data used when training the Discriminator has the object on the right.

Many thanks.

m-niemeyer commented 3 years ago

Hi @ardianumam , thanks for your interest in our project!

  1. No, the discriminator is not conditioned on the codes. We observe that injecting the two latent codes at different parts of the network leads to shape and appearance disentanglement in an unsupervised fashion; there is no explicit supervision on this! (A rough sketch of this injection pattern is given after this list.)
  2. Correct, the real data has to have similar properties w.r.t. (s, R, T). However, the nice thing about GAN training is that you compare distributions: hence, you do not need paired data of {image} and {corresponding s, R, T}; we only pre-define a prior over {s, R, T} which roughly matches the data distribution, and can then train without any paired information / direct supervision. (See the second sketch below.)
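For item 1, here is a minimal, hypothetical sketch of the injection pattern described above. The class and argument names are illustrative and not taken from this repository's code; the idea is simply that the shape code enters the trunk that predicts density, while the appearance code only enters the final feature head.

```python
import torch
import torch.nn as nn

class TinyFeatureField(nn.Module):
    """Illustrative sketch (not the repo's exact decoder): the shape code
    conditions the trunk that produces density, while the appearance code
    is only injected before the feature/color head."""
    def __init__(self, dim_pt=3, dim_shape=64, dim_app=64, hidden=128, dim_out=128):
        super().__init__()
        # Trunk sees point encoding + shape code -> controls geometry/density.
        self.trunk = nn.Sequential(
            nn.Linear(dim_pt + dim_shape, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.sigma_head = nn.Linear(hidden, 1)
        # Feature head additionally sees the appearance code -> controls appearance.
        self.feat_head = nn.Sequential(
            nn.Linear(hidden + dim_app, hidden), nn.ReLU(),
            nn.Linear(hidden, dim_out),
        )

    def forward(self, pts, z_shape, z_app):
        # In practice the same latent code is broadcast to all points of an object.
        h = self.trunk(torch.cat([pts, z_shape], dim=-1))
        sigma = self.sigma_head(h)                              # depends on z_shape only
        feat = self.feat_head(torch.cat([h, z_app], dim=-1))    # also depends on z_app
        return sigma, feat

# Quick shape check with illustrative dimensions:
pts = torch.randn(1, 1024, 3)
z_shape = torch.randn(1, 1024, 64)
z_app = torch.randn(1, 1024, 64)
sigma, feat = TinyFeatureField()(pts, z_shape, z_app)
```

Because only the density/trunk path ever sees z_shape and only the feature head sees z_app, changing one code while fixing the other tends to change geometry and appearance separately, which is the disentanglement observed during training.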
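For item 2, a minimal sketch of what such a pre-defined prior over (s, R, T) could look like. The function name and the ranges here are made up for illustration; the actual per-dataset ranges live in the repository's configs. The key point is that the pose is sampled independently of the real batch, so no pairing is required.

```python
import math
import torch

def sample_pose_prior(batch_size,
                      scale_range=(0.8, 1.2),
                      trans_range=(-0.2, 0.2),
                      rot_range=(0.0, 2 * math.pi)):
    """Illustrative prior over (s, R, T): ranges are placeholders chosen so the
    rendered distribution roughly matches the real image distribution."""
    s = torch.empty(batch_size, 3).uniform_(*scale_range)   # per-axis scale
    t = torch.empty(batch_size, 3).uniform_(*trans_range)   # translation
    theta = torch.empty(batch_size).uniform_(*rot_range)    # rotation about the up-axis
    cos, sin = torch.cos(theta), torch.sin(theta)
    R = torch.zeros(batch_size, 3, 3)
    R[:, 0, 0], R[:, 0, 1] = cos, -sin
    R[:, 1, 0], R[:, 1, 1] = sin, cos
    R[:, 2, 2] = 1.0
    return s, R, t

# GAN training then only compares distributions:
# fake images rendered with sample_pose_prior(B) vs. an unpaired batch of real images.
s, R, t = sample_pose_prior(8)
```

If the prior is badly mismatched with the data (e.g. objects only ever appear on the right in real images but T places them anywhere), the discriminator will push the generator toward the real distribution, so the prior should roughly match the dataset statistics.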