jasonwu0731 / ToD-BERT

Pre-Trained Models for ToD-BERT
BSD 2-Clause "Simplified" License

L2 normalization for hid_resp and hid_cont #18

Closed mudong0419 closed 3 years ago

mudong0419 commented 3 years ago

@jasonwu0731 Thanks for your excellent work. In the RCL (response-contrastive loss) computation:

```python
# Calculate RCL loss
scores = torch.matmul(hid_cont, hid_resp.transpose(1, 0))
loss_rs = xeloss(scores, resp_label)
loss += loss_rs
loss_rs = loss_rs.item()
```

When calculating the response-selection loss, is it necessary to L2-normalize hid_resp and hid_cont before the matmul, so that the [CLS] vector serves as a proper sentence embedding?
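A minimal sketch of what the suggested change might look like, assuming `hid_cont` and `hid_resp` are batches of [CLS] vectors (shapes and the `resp_label` construction here are illustrative, not taken from the repo). With `F.normalize`, the dot products become cosine similarities bounded in [-1, 1]:

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes: a batch of context and response [CLS] vectors.
batch_size, hidden = 4, 8
hid_cont = torch.randn(batch_size, hidden)
hid_resp = torch.randn(batch_size, hidden)
resp_label = torch.arange(batch_size)  # each context pairs with its own response

xeloss = torch.nn.CrossEntropyLoss()

# Current form: raw dot-product scores (scale depends on vector norms).
scores_dot = torch.matmul(hid_cont, hid_resp.transpose(1, 0))

# Suggested variant: L2-normalize first, so scores are cosine similarities.
scores_cos = torch.matmul(
    F.normalize(hid_cont, p=2, dim=-1),
    F.normalize(hid_resp, p=2, dim=-1).transpose(1, 0),
)

loss_rs = xeloss(scores_cos, resp_label)
```

One practical note: with cosine scores capped at 1, the cross-entropy logits are compressed, so contrastive setups that normalize often also divide the scores by a temperature.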

jasonwu0731 commented 3 years ago

Hi,

I did not get what you are suggesting, can you point me to the code or equation? Thank you