microsoft / CodeXGLUE


The loss function of code search #57

Closed by hbenyamina 3 years ago

hbenyamina commented 3 years ago

I am trying to write about the code search task, but I did not understand the following part:

         scores=(nl_vec[:,None,:]*code_vec[None,:,:]).sum(-1)  
         loss_fct = CrossEntropyLoss()  
         loss = loss_fct(scores, torch.arange(bs, device=scores.device))  

In: https://github.com/microsoft/CodeXGLUE/blob/a0e2febebf20d551d479df40a51508b7797fea91/Text-Code/NL-code-search-Adv/code/model.py#L31

Can you please explain what type of similarity this is? Is this cosine similarity? And why are you comparing the result with a ranking?

Thanks in advance

guody5 commented 3 years ago

We use the dot product of the NL and code vectors, rather than cosine similarity, as the score. Please refer to Section 4.1 of https://arxiv.org/pdf/1909.09436.pdf
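
On the second question, it may help to spell out what the three quoted lines compute. `scores` is a `bs x bs` matrix whose entry `(i, j)` is the dot product of query `i` and code snippet `j`, and `torch.arange(bs)` is a vector of class indices rather than a ranking: for row `i` the "correct class" is column `i`, the snippet paired with that query, and every other snippet in the batch acts as a negative. Below is a minimal sketch of the same computation in plain Python, not the repository's code; the helper names `dot` and `in_batch_loss` and the toy vectors are made up for illustration.

```python
import math

def dot(u, v):
    # Unnormalized dot product; cosine similarity would also divide by |u| * |v|.
    return sum(a * b for a, b in zip(u, v))

def in_batch_loss(nl_vecs, code_vecs):
    """Mean cross-entropy over score rows, with target index i for row i."""
    bs = len(nl_vecs)
    # scores[i][j] = dot product of query i and code snippet j (bs x bs matrix)
    scores = [[dot(q, c) for c in code_vecs] for q in nl_vecs]
    total = 0.0
    for i, row in enumerate(scores):
        # log-sum-exp of the row, shifted by the max for numerical stability
        m = max(row)
        log_z = m + math.log(sum(math.exp(s - m) for s in row))
        # negative log-probability assigned to the paired snippet (diagonal entry)
        total += log_z - row[i]
    return scores, total / bs

# Toy embeddings, invented for this example
nl_vecs = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[2.0, 0.0], [0.0, 2.0]]   # snippet i matches query i
swapped = [[0.0, 2.0], [2.0, 0.0]]   # snippet i matches the other query

scores, loss_aligned = in_batch_loss(nl_vecs, aligned)
_, loss_swapped = in_batch_loss(nl_vecs, swapped)
print(scores, loss_aligned, loss_swapped)
```

The loss is small when each query scores highest against its own snippet (large diagonal entries) and grows when another snippet in the batch scores higher. Because the dot product is not normalized, vector magnitudes also affect the scores, which is one way it differs from cosine similarity.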

hbenyamina commented 3 years ago

Oh great. Thank you.