DLR-RM / AugmentedAutoencoder

Official Code: Implicit 3D Orientation Learning for 6D Object Detection from RGB Images
MIT License

About the code details #27

Closed: backkon closed this issue 5 years ago

backkon commented 5 years ago

Hi! I want to know the meaning of train_x, mask_x, noof_obj_pixels, train_y, and hashlib.md5(...).hexdigest(). Looking forward to your explanation!

MartinSmeyer commented 5 years ago

Hi! train_x contains the rendered object views used to train the AAE, generated under different lighting conditions and with random translations. (The other augmentations are applied online during training.)

mask_x holds the masks belonging to train_x, so that at training time you can replace the background with images.
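To illustrate how a mask lets you swap the background, here is a minimal NumPy sketch. It is not the repository's code: the function name `replace_background`, the array shapes, and the convention that the mask is True on background pixels are all assumptions made for this example.

```python
import numpy as np

def replace_background(train_x, mask_x, backgrounds):
    """Paste background pixels wherever the mask marks 'no object'.

    Assumed (hypothetical) conventions for this sketch:
      train_x, backgrounds: (N, H, W, C) image arrays of the same shape
      mask_x: (N, H, W) boolean array, True on background pixels
    """
    out = train_x.copy()                   # leave the cached views untouched
    out[mask_x] = backgrounds[mask_x]      # boolean indexing selects (K, C) pixels
    return out
```

Because the views are copied first, the same cached train_x can be combined with a different random background in every training batch.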

train_y is the reconstruction target, i.e. centered object views at constant lighting.

noof_obj_pixels is the number of visible object pixels before/after applying occlusions to the input. It can be used to constrain the occlusions applied to train_x so that they do not hide too much of the object.
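One way such a pixel count could limit occlusions is sketched below. This is only an illustration under assumptions, not the repository's augmentation code: the function `occlude_with_limit`, the rectangular occluder, and the parameters `max_occluded_fraction` and `max_tries` are hypothetical.

```python
import numpy as np

def occlude_with_limit(obj_mask, max_occluded_fraction=0.5, max_tries=20, rng=None):
    """Remove a random rectangle from a boolean object mask, but only accept
    it if enough object pixels stay visible (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    h, w = obj_mask.shape
    pixels_before = int(obj_mask.sum())        # visible object pixels before occlusion
    for _ in range(max_tries):
        y0, x0 = rng.integers(0, h), rng.integers(0, w)
        y1, x1 = rng.integers(y0, h) + 1, rng.integers(x0, w) + 1
        occluded = obj_mask.copy()
        occluded[y0:y1, x0:x1] = False         # pixels hidden by the occluder
        pixels_after = int(occluded.sum())     # visible object pixels after occlusion
        if pixels_after >= (1.0 - max_occluded_fraction) * pixels_before:
            return occluded, pixels_before, pixels_after
    # No acceptable occluder found: fall back to the unoccluded mask.
    return obj_mask.copy(), pixels_before, pixels_before
```

Comparing the counts before and after is what keeps the augmentation from hiding nearly the whole object.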

Since I do not want to regenerate the synthetic training data for every training run, I compute a hash from the configuration parameters that influence the rendering process. The resulting images are cached as <hash>.npy in the $AEWS/tmp_dataset folder.
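The caching idea can be sketched roughly as follows. This is an assumption-laden illustration, not the repository's implementation: `config_hash`, `load_or_render`, the dictionary of rendering parameters, and the `render_fn` callback are hypothetical names, and the real code may build the hashed string differently.

```python
import hashlib
import os
import numpy as np

def config_hash(render_params):
    """MD5 hex digest of the parameters that affect rendering (hypothetical helper)."""
    # Sort the items so the same parameters always produce the same string.
    canonical = repr(sorted(render_params.items()))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()

def load_or_render(render_params, cache_dir, render_fn):
    """Load cached images if a matching <hash>.npy exists, otherwise render and cache."""
    path = os.path.join(cache_dir, config_hash(render_params) + ".npy")
    if os.path.exists(path):
        return np.load(path)               # cache hit: skip rendering
    images = render_fn(render_params)      # cache miss: render once
    os.makedirs(cache_dir, exist_ok=True)
    np.save(path, images)
    return images
```

Changing any hashed parameter yields a different file name, so stale data is never reused, while parameters that do not affect rendering can change without triggering a re-render.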