caskcsg / ir

ConTextual Mask Auto-Encoder for Dense Passage Retrieval
Apache License 2.0

About loss_ab and loss_ba in eq(8) #2

Open chenchongthu opened 1 year ago

chenchongthu commented 1 year ago

It seems that there is only loss_ab in the released code, right?

ma787639046 commented 1 year ago

Hi, sorry for the late reply; I didn't get a notification for this issue. As implemented in CotMAECollator.__call__ in data.py, we unpack each text span (the anchor) and its contextual span and forward both through the encoder in the same batch. Both spans produce an MLM loss, so loss_ab + loss_ba are both already included in the encoder's single MLM loss. https://github.com/caskcsg/ir/blob/ba950cabe3f5cead495f6e3bec119ce3f48b666f/cotmae/data.py#L186
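To make the idea concrete, here is a minimal pure-Python sketch. It is not the repository's code. The names `unpack_pairs`, `toy_span_loss`, and `batch_loss` are hypothetical, and the "loss" is a toy stand-in (a count of masked positions) for the summed MLM cross-entropy. It only illustrates that once each (anchor, contextual span) pair is flattened into one batch, a single batch-level loss already contains the contribution from both sides.

```python
from typing import Dict, List, Tuple

# Hypothetical representation of one masked span: token ids plus MLM labels.
Span = Dict[str, List[int]]

# Label for positions that were not masked (the usual ignore-index convention).
IGNORE = -100


def unpack_pairs(pairs: List[Tuple[Span, Span]]) -> List[Span]:
    """Flatten [(anchor, context), ...] into [anchor, context, ...]."""
    flat: List[Span] = []
    for anchor, context in pairs:
        flat.append(anchor)
        flat.append(context)
    return flat


def toy_span_loss(span: Span) -> float:
    """Toy stand-in for a span's summed MLM loss: number of masked positions."""
    return float(sum(1 for label in span["labels"] if label != IGNORE))


def batch_loss(batch: List[Span]) -> float:
    """One loss over the whole flattened batch."""
    return sum(toy_span_loss(span) for span in batch)


# One training pair: an anchor span and its contextual span (made-up token ids).
pairs = [
    (
        {"input_ids": [101, 103, 2003, 102], "labels": [IGNORE, 2023, IGNORE, IGNORE]},
        {"input_ids": [101, 103, 103, 102], "labels": [IGNORE, 2009, 2001, IGNORE]},
    )
]

flat = unpack_pairs(pairs)
loss_anchor_side = sum(toy_span_loss(a) for a, _ in pairs)
loss_context_side = sum(toy_span_loss(c) for _, c in pairs)
total = batch_loss(flat)

# The single batch-level loss equals the sum of the two per-side terms.
assert total == loss_anchor_side + loss_context_side
```

Note that with a mean-reduced cross-entropy (the common default), the single loss would be a masked-token-weighted average of the two sides rather than an exact sum, but both terms are still present in it.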

Feel free to ask me if you have any further questions.