DLR-RM / AugmentedAutoencoder

Official Code: Implicit 3D Orientation Learning for 6D Object Detection from RGB Images

Getting bad rendered target image #58

Closed davideCremona closed 4 years ago

davideCremona commented 4 years ago

Hi, I'm experimenting with your repository using T-LESS object number 1.

I've created my custom cfg file as described in the README and trained an AAE with "ae_train". When I look at the checkpoint images that the script writes during training, I see some strange glitches in the target images, as shown here:

[training_images_9999.png]
[training_images_19999.png]
[training_images_29999.png]

I'm wondering whether that broken rendering is an OpenGL fault or a side effect of the modifications I had to make to port the code to Python 3, though those were nothing special.

I'm currently trying to figure out what is causing this problem, so any idea is welcome. Thank you in advance!

EDIT

My config file is the following:

[Paths]
MODEL_PATH: /home/DATI/insulators/models_cad/obj_01.ply
BACKGROUND_IMAGES_GLOB: /home/DATI/insulators/VOCdevkit/VOC2012/JPEGImages/*.jpg

[Dataset]
MODEL: cad
H: 128
W: 128
C: 3
RADIUS: 700
RENDER_DIMS: (720, 540)
K: [1075.65, 0, 720/2, 0, 1073.90, 540/2, 0, 0, 1]
# Scale vertices to mm
VERTEX_SCALE: 1
ANTIALIASING: 1
PAD_FACTOR: 1.2
CLIP_NEAR: 10
CLIP_FAR: 10000
NOOF_TRAINING_IMGS: 20000
NOOF_BG_IMGS: 15000

[Augmentation]
REALISTIC_OCCLUSION: False
SQUARE_OCCLUSION: False
MAX_REL_OFFSET: 0.20
CODE: Sequential([
    #Sometimes(0.5, PerspectiveTransform(0.05)),
    #Sometimes(0.5, CropAndPad(percent=(-0.05, 0.1))),
    Sometimes(0.5, Affine(scale=(1.0, 1.2))),
    Sometimes(0.5, CoarseDropout( p=0.2, size_percent=0.05) ),
    Sometimes(0.5, GaussianBlur(1.2*np.random.rand())),
    Sometimes(0.5, Add((-25, 25), per_channel=0.3)),
    Sometimes(0.3, Invert(0.2, per_channel=True)),
    Sometimes(0.5, Multiply((0.6, 1.4), per_channel=0.5)),
    Sometimes(0.5, Multiply((0.6, 1.4))),
    Sometimes(0.5, ContrastNormalization((0.5, 2.2), per_channel=0.3))
    ], random_order=False)

[Embedding]
EMBED_BB: True
MIN_N_VIEWS: 2562
NUM_CYCLO: 36

[Network]
BATCH_NORMALIZATION: False
AUXILIARY_MASK: False
VARIATIONAL: 0
LOSS: L2
BOOTSTRAP_RATIO: 4
NORM_REGULARIZE: 0
LATENT_SPACE_SIZE: 128
NUM_FILTER: [128, 256, 512, 512]
STRIDES: [2, 2, 2, 2]
KERNEL_SIZE_ENCODER: 5
KERNEL_SIZE_DECODER: 5

[Training]
OPTIMIZER: Adam
NUM_ITER: 30000
BATCH_SIZE: 64
LEARNING_RATE: 2e-4
SAVE_INTERVAL: 10000

[Queue]
# OPENGL_RENDER_QUEUE_SIZE: 500
NUM_THREADS: 10
QUEUE_SIZE: 50

MartinSmeyer commented 4 years ago

Thanks for the Python 3 fixes. Hmm, could you post your training config here? Changing line 70 in geometry.py should not be necessary.

davideCremona commented 4 years ago

Hi Martin, thank you for responding. I had to change that line in geometry.py because it gave me an "OutOfBounds" error when reading the list of vertices.

I think I have fixed this by looking at the .ply file I was loading. The file lists the vertices first and then the faces, but the code in geometry.py was reading the vertices in order. The original code assumes that the vertices are face-ordered, like this: [vertex1_face1, vertex2_face1, vertex3_face1, vertex1_face2, vertex2_face2, vertex3_face2, ...]. But in a normal .ply file you find the vertices only once: [vertex1, vertex2, vertex3, ...], followed by a description of how the faces are built, referencing vertices by their index: [(0, 4, 100), (0, 3, 10), ...].

So you can imagine that reading a standard .ply file while assuming the vertices are face-ordered will cause the sort of glitches shown in my previous post.
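For reference, a minimal ASCII .ply file with this layout (the numbers are made up, just to illustrate the vertex list followed by the index-based face list) looks like this:

ply
format ascii 1.0
element vertex 4
property float x
property float y
property float z
element face 2
property list uchar int vertex_indices
end_header
0.0 0.0 0.0
1.0 0.0 0.0
1.0 1.0 0.0
0.0 1.0 0.0
3 0 1 2
3 0 2 3

Each face line starts with a vertex count (3) and then references vertices by index, so vertex 0 is shared by both faces instead of being stored twice.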

I have changed the way the vertices are passed to "compute_normals" in geometry.py so that it receives a list of face-ordered vertices, duplicating vertices where necessary.

Just to be sure: are you pre-processing the .ply files with some code?

I have another issue, to be honest: I have trained an AAE with correctly rendered images, but now I am getting the same (wrong) prediction for different test poses.

Test image 1: [snapshot00]

predicted Rotation:

[[ 1.0000000e+00  0.0000000e+00  0.0000000e+00]
 [ 0.0000000e+00 -1.0000000e+00 -1.2246468e-16]
 [ 0.0000000e+00  1.2246468e-16 -1.0000000e+00]]

pred_view 1: [pred_view_snapshot00]

Test image 2: [snapshot03]

predicted Rotation (same as test image 1):

[[ 1.0000000e+00  0.0000000e+00  0.0000000e+00]
 [ 0.0000000e+00 -1.0000000e+00 -1.2246468e-16]
 [ 0.0000000e+00  1.2246468e-16 -1.0000000e+00]]

pred_view 2: [pred_view_snapshot03]

I'm thinking there may be some problem in the codebook, like a k-NN failure or something similar. Do you have any idea about this?

davideCremona commented 4 years ago

I've done some debugging and, I don't know why, but if I set a breakpoint at line 59 of "codebook.py" and run in debug mode, the "cosine_similarity" list is filled with zeros.

How is the codebook built? Where are the latent representations and their corresponding rotations stored?
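For context, the codebook lookup described in the AAE paper is essentially a cosine-similarity nearest-neighbour search over precomputed latent codes of rendered template views. A minimal numpy sketch of the idea (not the repository's exact implementation; the function and variable names are illustrative):

import numpy as np

def nearest_rotation(z_test, codebook_z, codebook_R):
    # z_test:     (D,)      latent code of the test crop
    # codebook_z: (N, D)    latent codes of N rendered template views
    # codebook_R: (N, 3, 3) rotation matrix of each template view
    z = z_test / np.linalg.norm(z_test)
    cb = codebook_z / np.linalg.norm(codebook_z, axis=1, keepdims=True)
    cos_sim = cb @ z  # cosine similarity to every codebook entry, shape (N,)
    # If the codebook was never filled (e.g. the embedding step was skipped
    # after training), cos_sim comes out as all zeros and argmax degenerates.
    return codebook_R[np.argmax(cos_sim)]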

davideCremona commented 4 years ago

Ok, solved by executing "ae_embed.py" after the training. Now I'm getting correctly predicted rotations and can move on to the next step, 6D pose estimation, yay!
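For anyone landing here later: if I read the README correctly, the codebook is created in a separate step after training, e.g. (the experiment name is just a placeholder):

ae_train exp_group/my_autoencoder
ae_embed exp_group/my_autoencoder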

FoxinSnow commented 4 years ago

@davideCremona Hi, I ran into the same issue you did, caused by the .ply file vertex order. Would you mind sharing how you changed the code or the .ply file to get the correct result? Thank you very much!

davideCremona commented 4 years ago

Hi @FoxinSnow, I added some lines of code that read the .ply faces and duplicate vertices when needed. It's not optimized, but it works :)

In auto_pose/meshrenderer/gl_utils/geometry.py, in the function load_meshes(obj_files, vertex_tmp_store_folder, recalculate_normals=False), just after this line inside the for loop:

mesh = scene.meshes[0]

add the following code:

vertices = []
# Expand the indexed vertex list into a face-ordered one: for every face,
# append its three vertices, duplicating vertices shared between faces.
for face in mesh.faces:
    vertices.extend([mesh.vertices[face[0]],
                     mesh.vertices[face[1]],
                     mesh.vertices[face[2]]])
vertices = np.array(vertices)

This should work (it worked for me). Let me know if you run into any problems :)

FoxinSnow commented 4 years ago

@davideCremona Thank you very much for sharing :)