rpautrat / SuperPoint

Efficient neural feature detector and descriptor
MIT License
1.89k stars 417 forks

Descriptor training #237

Open jeethesh-pai opened 2 years ago

jeethesh-pai commented 2 years ago

Hello @rpautrat,

I am very grateful for your repository; it helped me a lot to understand the implementation of the SuperPoint paper. I am currently doing a student project on the performance of SuperPoint on laser-scanned images. Since the TensorFlow version is not compatible with the server I am deploying on (where I have limited permissions), I had to adapt the code to the newer version. I followed the style of your code and was successful up to training MagicPoint with two rounds of homographic adaptation, but the descriptor training is not working: the validation loss diverges from the very first iterations. I have some questions about descriptor training, and since you succeeded in training it, I think only you can help me with these doubts.

  1. The descriptor loss says that the norm of coord_cells - warped_coord_cells should be less than 8. Suppose we use a homography that is just a translation t_x of 10 pixels to the right. Applying this transformation shifts warped_coord_cells 10 pixels to the right of the original coordinates. That means that in the correspondence matrix s(h, w, h', w'), the first cell no longer matches the first cell of warped_coord_cells, but it does match the second cell of warped_coord_cells. Is that right?
  2. If the above is correct, then in the rotation-only case we get correspondences only in the centre part of the image, because after the homography the outer cells are always displaced by more than 8 pixels from their original correspondence cells. What is the ideal maximum rotation to prevent this?
  3. Going through the closed issues on the descriptor loss, I saw that some people mentioned it was diverging until they removed the normalization. Since you also implemented it that way initially and got better results than with the normalization: do you think removing just the normalization of desc_product will make training better, or does removing both the normalization of the last descriptor convolution layer and the normalization of desc_product help more?
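To make question 1 concrete, here is a minimal numpy sketch of the correspondence matrix as described in the paper: a cell pair corresponds iff the warped cell centre lies within 8 pixels of the other cell centre. The image size and the pure-translation homography below are illustrative assumptions, not values from the repo.

```python
import numpy as np

# Assumed toy setup: a 24x24 image split into 8x8 cells.
H_img, W_img, cell = 24, 24, 8
Hc, Wc = H_img // cell, W_img // cell

# Centres of the cells in pixel coordinates (x, y).
ys, xs = np.meshgrid(np.arange(Hc), np.arange(Wc), indexing='ij')
coord_cells = np.stack([xs, ys], axis=-1) * cell + cell // 2  # (Hc, Wc, 2)

# Pure translation of t_x = 10 px to the right, as in question 1.
H_mat = np.array([[1., 0., 10.],
                  [0., 1., 0.],
                  [0., 0., 1.]])

pts = coord_cells.reshape(-1, 2).astype(float)
pts_h = np.concatenate([pts, np.ones((len(pts), 1))], axis=1)
warped = (pts_h @ H_mat.T)[:, :2]  # warped cell centres

# s[i, j] = 1 iff warped centre i is within 8 px of original centre j.
dist = np.linalg.norm(warped[:, None, :] - pts[None, :, :], axis=-1)
s = (dist <= cell).astype(int)

print(s[0, 0], s[0, 1])  # → 0 1
```

With t_x = 10, cell (0, 0) is displaced by 10 px, so it no longer matches itself (10 > 8) but does match its right-hand neighbour (distance 2 px), which confirms the intuition in question 1.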

Training info: I used learning rates of 0.01 and 0.001 with Adam, and the following homographic adaptation config

        homographic:
            enable: true
            num: 1 # 100
            aggregation: 'sum'
            filter_counts: 0
            homographies:
                params:
                    translation: true
                    rotation: true
                    scaling: false
                    perspective: false
                    scaling_amplitude: 0.5
                    perspective_amplitude_x: 0.7
                    perspective_amplitude_y: 0.7
                    allow_artifacts: true
                    patch_ratio: 0.85
                    max_angle: 0.785
        valid_border_margin: 3

for the descriptor training.
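Regarding question 2, a quick back-of-the-envelope check shows why the config's max_angle of 0.785 rad (~45°) leaves only central cells with correspondences under a pure rotation. The 240x320 image size below is an assumption for illustration; a point at radius r from the rotation centre moves 2·r·sin(angle/2), so it stays within 8 px only if angle ≤ 2·arcsin(4/r).

```python
import numpy as np

# Assumed image size; max_angle = 0.785 rad is taken from the config above.
H_img, W_img = 240, 320
cx, cy = W_img / 2, H_img / 2

def displacement(x, y, angle):
    # Distance between a point and its image under rotation about the centre.
    c, s = np.cos(angle), np.sin(angle)
    xr = c * (x - cx) - s * (y - cy) + cx
    yr = s * (x - cx) + c * (y - cy) + cy
    return np.hypot(xr - x, yr - y)

# A cell centre near the image corner vs. one near the centre.
corner = displacement(4, 4, 0.785)           # far beyond 8 px
centre = displacement(cx + 4, cy + 4, 0.785)  # still within 8 px
print(round(corner, 1), round(centre, 1))

# Largest rotation keeping the corner cell within 8 px of itself:
r = np.hypot(4 - cx, 4 - cy)
print(round(2 * np.arcsin(4 / r), 4), 'rad')
```

So with a 45° rotation only cells a few pixels from the rotation centre keep a correspondence, while the safe angle for corner cells is on the order of a few hundredths of a radian.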

Thank you for the help

litingsjj commented 2 years ago

Same question! Have you solved it?

jeethesh-pai commented 2 years ago

No. It needs a good weighting factor for the combined loss; the factor used in this repo and the one mentioned in the paper are not working for me. With both factors my loss diverges.
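For reference, this is a minimal numpy sketch of the weighting factor being discussed, following the paper's objective L = L_det(X) + L_det(X') + λ·L_desc, with the descriptor hinge loss using margins m_pos = 1, m_neg = 0.2 and a weight λ_d = 250 on the positive term. The constants follow the paper; whether λ = 0.0001 actually balances the terms for a given dataset is exactly the open question here.

```python
import numpy as np

# Constants from the SuperPoint paper (assumed, not from this thread).
lambda_d, lambda_, m_pos, m_neg = 250.0, 1e-4, 1.0, 0.2

def hinge_desc_loss(desc, warped_desc, s):
    # desc, warped_desc: (N, D) L2-normalised cell descriptors;
    # s: (N, N) binary correspondence matrix between the two images.
    dot = desc @ warped_desc.T
    pos = lambda_d * s * np.maximum(0.0, m_pos - dot)   # pull matches together
    neg = (1 - s) * np.maximum(0.0, dot - m_neg)        # push non-matches apart
    return (pos + neg).mean()

def total_loss(det_loss, warped_det_loss, desc_loss):
    # Combined objective: two detector losses plus the weighted descriptor loss.
    return det_loss + warped_det_loss + lambda_ * desc_loss

# Toy usage with random unit descriptors and identity correspondences.
rng = np.random.default_rng(0)
d = rng.normal(size=(4, 8))
d /= np.linalg.norm(d, axis=1, keepdims=True)
s = np.eye(4)
print(total_loss(1.0, 1.0, hinge_desc_loss(d, d, s)))
```

Because λ_d multiplies only the positive term while λ scales the whole descriptor loss, a mismatch between the two (or unnormalised descriptor magnitudes) can easily make one term dominate and the combined loss diverge.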