PengchengWang opened this issue 5 years ago
I wish the author could upload a trained model to convince us...
Hi @PengchengWang, your observation is correct. I've experienced similar behavior when training on rooms_ring_camera. The model learns to render the square rooms themselves (i.e. floor, background, and sky) relatively well from the correct angle, but the rendered rooms stay mostly empty until very late in training (>1.5M iterations), and even then the rendered objects are just blurry glimpses. My hunch is that a careful learning-rate schedule needs to be implemented to get the object details right (see the schedule sketch below). It is also quite plausible that the network learns to render empty rooms first, since those regions make up the lion's share of each image and therefore dominate the L2 reconstruction term.

However, the loss plateauing around 6.9k is expected. At some point the reconstruction term can do no better than predict the exact means of the target images, and the only thing still being optimized afterwards is the KL divergence (which should continue to decrease). The back-of-the-envelope calculation below shows why the plateau lands near 6.9k.

@wenbingz The only datasets on which I managed to get the view-interpolation magic to work reliably are the Shepard-Metzler ones. I'm trying to upload a good snapshot together with a run script.
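Regarding the schedule hunch: for reference, here is a minimal sketch of the annealing schedules as I read them from the GQN paper's supplementary (Eslami et al., 2018). The constants (5e-4 → 5e-5 over 1.6M steps for the learning rate, 2.0 → 0.7 over 200k steps for the pixel standard deviation) are my reading of the paper, not something this repo necessarily implements:

```python
# Linear annealing schedules per my reading of the GQN paper's
# supplementary; constants are assumptions, verify against the paper.

def learning_rate(step, lr_initial=5e-4, lr_final=5e-5, anneal_steps=1.6e6):
    """Linearly anneal the learning rate, then hold it at lr_final."""
    frac = min(step / anneal_steps, 1.0)
    return lr_final + (lr_initial - lr_final) * (1.0 - frac)

def pixel_sigma(step, sigma_initial=2.0, sigma_final=0.7, anneal_steps=2e5):
    """Linearly anneal the output-likelihood standard deviation."""
    frac = min(step / anneal_steps, 1.0)
    return sigma_final + (sigma_initial - sigma_final) * (1.0 - frac)

if __name__ == "__main__":
    for step in (0, 100_000, 200_000, 1_600_000, 2_000_000):
        print(f"step {step:>9}: lr={learning_rate(step):.2e}, "
              f"sigma={pixel_sigma(step):.2f}")
```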
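And here is the sanity check on the plateau value. Assuming 64×64×3 target images and a Gaussian output likelihood whose standard deviation has been annealed down to σ = 0.7 (the final value above), a decoder that predicts every pixel perfectly still pays the Gaussian normalisation constant, which comes out at roughly 6.9k nats:

```python
import math

# If the decoder predicts the target exactly (mu = x), the negative
# log-likelihood reduces to the normalisation constant:
#   -log N(x; x, sigma^2 * I) = (D / 2) * log(2 * pi * sigma^2)
# D and sigma below are assumptions (64x64x3 images, final sigma = 0.7).

D = 64 * 64 * 3        # pixels per target image
sigma = 0.7            # final pixel std from the annealing schedule

nll_floor = 0.5 * D * math.log(2 * math.pi * sigma**2)
print(f"reconstruction floor ~ {nll_floor:.0f} nats")  # ~ 6909
```

That ≈6909 floor matches the ~6.9k plateau surprisingly well, which is why I don't think the plateau itself indicates a bug.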
I've trained the model on rooms_ring_camera twice, and each time the loss stopped decreasing around 7000. So I ran view_interpolation.ipynb and got this...