DLR-RM / AugmentedAutoencoder

Official Code: Implicit 3D Orientation Learning for 6D Object Detection from RGB Images

Strange training images generated when bgr_y random_light is set to True #65

Closed: davideCremona closed this 4 years ago

davideCremona commented 4 years ago

Hi, I'm trying to run some experiments with your code, but I'm seeing some strange behavior that I can't explain. I've slightly changed the code of meshrenderer.py (the render function) to use different Phong parameters depending on the 'random_light' flag:

phong_x = {'ambient': 0.1, 'diffuse': 0.9, 'specular': 0.1}
phong = {'ambient': 0.4, 'diffuse': 0.8, 'specular': 0.3}  # default one
if random_light:
    # random light position plus small perturbations around the phong_x values
    self.set_light_pose(1000. * np.random.random(3))
    self.set_ambient_light(phong_x['ambient'] + 0.1 * (2 * np.random.rand() - 1))
    self.set_diffuse_light(phong_x['diffuse'] + 0.1 * (2 * np.random.rand() - 1))
    self.set_specular_light(phong_x['specular'] + 0.1 * (2 * np.random.rand() - 1))
else:
    # fixed light pose with the default Phong parameters
    self.set_light_pose([400., 400., 400.])
    self.set_ambient_light(phong['ambient'])
    self.set_diffuse_light(phong['diffuse'])
    self.set_specular_light(phong['specular'])

Now, in dataset.py, render_training_images() renders bgr_x and bgr_y (the input of the encoder-decoder network and its target). The default parameters use "random_light=False" for bgr_y and "random_light=True" for bgr_x to generalize better over lighting.
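Schematically, the two calls look like this (the argument list here is illustrative, not the exact render() signature; only the random_light flag matters):

# Sketch of the two render calls in render_training_images();
# argument names other than random_light are illustrative placeholders.
bgr_x, depth_x = self.renderer.render(obj_id, W, H, K, R, t, random_light=True)   # encoder input, random lighting
bgr_y, depth_y = self.renderer.render(obj_id, W, H, K, R, t, random_light=False)  # reconstruction target, fixed lighting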

My experiment is to change the rendering call of bgr_y to "random_light=True", but when I start training the model I get these training images (attachment: training_images_29999).

Do you have any idea why the model is not predicting anything? The only change to the code is that flag set to True instead of False, which causes different shader parameters to be used. Could that alone cause this problem?

I also have another question: what are the camera parameters? I need them to set up a Blender camera matching your OpenGL one.

Thank you in advance :)

MartinSmeyer commented 4 years ago

Hey, bgr_y should stay at random_light=False, because the input images should be mapped to a canonical lighting; that is what makes the encoder robust against different lighting conditions. I think you get black reconstructions because your objects are too thin, so most of the target is black. You can fix it; quoting the README:

"Middle part should show reconstructions of the input object (if all black, set higher bootstrap_ratio / auxilliary_mask in training config)"

Camera parameters are defined in the training config; you can set them arbitrarily. :)
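To mirror them in Blender, the usual pinhole-to-Blender conversion applies. A rough sketch, assuming you take fx, fy, cx, cy from the K matrix in your config and W, H from the render dimensions there (the numbers below are placeholders, and the shift sign conventions are worth double-checking):

import bpy

# Placeholder intrinsics: replace with the values from your training config.
fx, fy = 1075.65, 1073.90   # focal lengths in pixels (assumed values)
cx, cy = 360.0, 270.0       # principal point in pixels (assumed values)
W, H = 720, 540             # render resolution in pixels (assumed values)

cam_data = bpy.data.cameras.new('aae_camera')
cam_data.sensor_fit = 'HORIZONTAL'
cam_data.sensor_width = 36.0                     # mm, arbitrary reference size
cam_data.lens = fx * cam_data.sensor_width / W   # pixel focal length -> mm

# Principal point offset; with HORIZONTAL fit Blender normalizes both
# shifts by the sensor width. Sign conventions may need flipping.
cam_data.shift_x = (W / 2.0 - cx) / W
cam_data.shift_y = (cy - H / 2.0) / W

cam_obj = bpy.data.objects.new('aae_camera', cam_data)
bpy.context.scene.collection.objects.link(cam_obj)
bpy.context.scene.render.resolution_x = W
bpy.context.scene.render.resolution_y = H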