yuruntian / L2-Net

Code and mode for CVPR 2017 paper "L2-Net: Deep Learning of Discriminative Patch Descriptor in Euclidean Space"
111 stars 36 forks source link

Questions about training process of loss function #6

Closed iCGY96 closed 6 years ago

iCGY96 commented 6 years ago

Hi @yuruntian! I'm trying to reproduce your results using tensorflow. I have some questions about training process of L2-Net.

  1. The author state in the paper that 2 is the maximum L2 distance between two unit vectors. Is there any normalization for the intermediate features or the output descriptor before calculating the loss function? I'm trying to reproduce your results using tensorflow. However, the loss value sometimes would be NaN without any normalization for the intermediate features or the output descriptor.

  2. Did you use any max pool layer on your L2-Net? The size of output descriptor is 128. However, the size of features maps after the final batch norm layer without any pool would be 8x8x128. How to transform these features map to 128 descriptors?

Thanks, Guangyao

xuelunshen commented 6 years ago
  1. Q: "The author state in the paper that 2 is the maximum L2 distance between two unit vectors." A: I think it should be minimum L2 distance between mathch/pair image patch , not maximum distance.

  2. Q: "Is there any normalization for the intermediate features or the output descriptor before calculating the loss function?" A: this is a good question that I want to figure out too.

  3. Q: "Did you use any max pool layer on your L2-Net?" A: No, no any pool is used there.

  4. Q: "The size of output descriptor is 128. However, the size of features maps after the final batch norm layer without any pool would be 8x8x128." A: maybe the last 8 8 convolution should be valid padding, it will make the output to be 1 1 * 128

Finally, I am a new reader for this paper, above things is my understanding, please based on answers from author.

yuruntian commented 6 years ago

@iCGY96 no normalization is applied to the intermediate features , but the final output descriptor is normalized to have unit norm.