DLR-RM / AugmentedAutoencoder

Official Code: Implicit 3D Orientation Learning for 6D Object Detection from RGB Images
MIT License

How to limit the SO(3) rotation in the training set? #117

Closed: AminSeffo closed this issue 1 year ago

AminSeffo commented 1 year ago

Hello @MartinSmeyer ,

For my use case I don't need all the rotations that are generated when creating the codebook, and I have problems with recognition: the predicted and actual rotations don't match (see the image below):

Screenshot from 2022-08-23 13-46-17

The AAE prediction is misaligned around the x-axis by up to 180°. My idea is to constrain the orientations as follows:

X-axis and Y-axis: [-45°, +45°], Z-axis: [-180°, +180°]

I have noticed that the training set contains many images from this wrong orientation, which could explain the prediction, so my question is: how can I limit the orientations in the config file?
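To make the constraint concrete, here is a minimal sketch (illustrative only, not from the repository) that checks whether a rotation matrix satisfies these limits, assuming the factorization R = Rz @ Ry @ Rx:

```python
import numpy as np

def in_allowed_range(R):
    """True if the Euler angles of R = Rz @ Ry @ Rx satisfy
    x, y in [-45 deg, +45 deg]; z is unrestricted."""
    ay = np.arcsin(np.clip(-R[2, 0], -1.0, 1.0))  # rotation about y
    ax = np.arctan2(R[2, 1], R[2, 2])             # rotation about x
    lim = np.pi / 4
    return abs(ax) <= lim and abs(ay) <= lim
```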

Here are some images from the actual training set:

Screenshot from 2022-08-23 13-42-23 Screenshot from 2022-08-23 13-42-26

This is how the object is oriented with Meshlab:

Screenshot from 2022-08-23 13-43-58

MartinSmeyer commented 1 year ago

Hi @AminSeffo,

there is no option for that in the config file. But you can simply adapt those two lines and recreate the codebook:

https://github.com/DLR-RM/AugmentedAutoencoder/blob/9f0a56f622fabf6200d9f034fcb2eef106997118/auto_pose/ae/dataset.py#L43-L44
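For example, restricting the embedding view sphere to roughly the ranges you describe could look like this (illustrative values, an assumption based on the ranges above, not a tested configuration; elevation limits the tilt, azimuth covers the z-rotation, and the codebook additionally samples in-plane rotations on top):

```python
# Illustrative adaptation of the two linked lines in auto_pose/ae/dataset.py
# (the defaults cover the full view sphere).
azimuth_range = (-np.pi, np.pi)        # rotation about z: [-180 deg, +180 deg]
elev_range = (-np.pi / 4, np.pi / 4)   # tilt away from the equator: [-45 deg, +45 deg]
```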

AminSeffo commented 1 year ago

Hey @MartinSmeyer, thanks a lot for your response. I will try it out. I think the main problem is the rotational symmetry of the detected part.

AminSeffo commented 1 year ago

> Hi @AminSeffo,
>
> there is no option for that in the config file. But you can simply adapt those two lines and recreate the codebook:
>
> https://github.com/DLR-RM/AugmentedAutoencoder/blob/9f0a56f622fabf6200d9f034fcb2eef106997118/auto_pose/ae/dataset.py#L43-L44

I could not see any change after setting these two lines to:

        azimuth_range = (0, 2 * np.pi)    # full 360 deg about the z-axis
        elev_range = (-0.5 * np.pi, 0)    # lower hemisphere only

After that I executed the following commands:

- pip install .
- ae_train exp_group/my_autoencoder
- ae_embed exp_group/my_autoencoder

training_images_499

The left, right, and top views can be seen below:

image(2)

MartinSmeyer commented 1 year ago

The visualization shows the training images, but those two lines only change the codebook/embedding. It doesn't hurt that the training still covers the whole pose space, as long as the embedding only contains the correct poses. You can visualize the generated rotations using this function: https://github.com/DLR-RM/AugmentedAutoencoder/blob/9f0a56f622fabf6200d9f034fcb2eef106997118/auto_pose/ae/dataset.py#L177
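As a standalone alternative (a quick sketch of my own, not repository code), you can scatter-plot the sampled view directions yourself; with the restricted ranges you should see a half-dome instead of a full sphere:

```python
import numpy as np
import matplotlib.pyplot as plt

# Sample azimuth/elevation with the restricted ranges and plot the resulting
# view directions; uniform-in-angle sampling is fine for a visual check.
n = 1000
azimuth = np.random.uniform(0, 2 * np.pi, n)
elev = np.random.uniform(-0.5 * np.pi, 0, n)
x = np.cos(elev) * np.cos(azimuth)
y = np.cos(elev) * np.sin(azimuth)
z = np.sin(elev)

ax = plt.figure().add_subplot(projection='3d')
ax.scatter(x, y, z, s=2)
ax.set_xlabel('x'); ax.set_ylabel('y'); ax.set_zlabel('z')
plt.show()
```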

If you still want to change it for training as well, you need to adapt the code here: https://github.com/DLR-RM/AugmentedAutoencoder/blob/9f0a56f622fabf6200d9f034fcb2eef106997118/auto_pose/ae/dataset.py#L243
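If you go that route, a minimal sketch (my own illustration, not repository code) of a constrained random rotation could look like this; note that sampling Euler angles uniformly is not uniform over SO(3), which may or may not matter for training:

```python
import numpy as np

def constrained_random_rotation(rng=np.random):
    """Sample R = Rz @ Ry @ Rx with x, y in [-45 deg, +45 deg]
    and z in [-180 deg, +180 deg]."""
    ax = rng.uniform(-np.pi / 4, np.pi / 4)
    ay = rng.uniform(-np.pi / 4, np.pi / 4)
    az = rng.uniform(-np.pi, np.pi)
    cx, sx = np.cos(ax), np.sin(ax)
    cy, sy = np.cos(ay), np.sin(ay)
    cz, sz = np.cos(az), np.sin(az)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx
```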

I don't bother much with specific poses, since the process needs to be repeated for every object and thus does not scale. But feel free to engineer it; it will probably give better results.

AminSeffo commented 1 year ago

@MartinSmeyer, thank you a lot. I will test it just as you explained.