dmis-lab / BioSyn

ACL'2020: Biomedical Entity Representations with Synonym Marginalization
https://arxiv.org/abs/2005.00239
MIT License

What if none of the top-k candidates is correct? Could the loss be infinite? #13

Closed flyangovoyang closed 2 years ago

flyangovoyang commented 2 years ago

Hello there. After reading the paper, I have a question about the loss calculation. Formula 7 in the paper defines the marginal probability of the positive synonyms of a mention m. If none of the top-k synonyms satisfies EQUAL(m, n) = 1, the marginal probability is zero. Formula 8 then takes log 0, which is negative infinity, so the loss would be infinite. That seems problematic. Looking forward to your reply ~
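For readers without the paper at hand, the two formulas have roughly the following shape. This is a reconstruction from the description above, so the exact symbols ($N_{1:k}$ for the top-k candidates, $M$ for the number of mentions in a batch) are assumptions and may differ from the paper's notation.

```latex
% Formula 7 (as described): marginal probability of the positive synonyms of mention m
P'(m) = \sum_{\substack{n \in N_{1:k} \\ \mathrm{EQUAL}(m, n) = 1}} P(n \mid m)

% Formula 8 (as described): negative marginal log-likelihood over M mentions
\mathcal{L} = -\frac{1}{M} \sum_{i=1}^{M} \log P'(m_i)

% The problematic case: no top-k candidate is a positive synonym, so the sum is empty
P'(m) = 0 \quad\Longrightarrow\quad -\log P'(m) = +\infty
```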

mjeensung commented 2 years ago

Hi @flyangovoyang

That's a really great point! For those cases, we filtered out samples with zero marginal probabilities. https://github.com/dmis-lab/BioSyn/blob/master/src/biosyn/rerankNet.py#L107
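The filtering idea can be illustrated with a small sketch. This is not the code from `rerankNet.py`; it is a pure-Python illustration, and the names `softmax` and `marginal_nll` as well as the list-based inputs are assumptions made for the example. Returning `0.0` when every sample is filtered out is also an assumed convention.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of candidate scores."""
    mx = max(scores)
    exps = [math.exp(s - mx) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def marginal_nll(batch_scores, batch_targets):
    """Negative marginal log-likelihood that skips samples without positives.

    batch_scores:  one list of top-k candidate scores per mention
    batch_targets: one list of 0/1 labels per mention (1 = positive synonym)
    """
    losses = []
    for scores, targets in zip(batch_scores, batch_targets):
        probs = softmax(scores)
        # Sum the probabilities of the positive candidates only.
        marginal = sum(p for p, t in zip(probs, targets) if t == 1)
        # Skip samples whose marginal probability is zero, so log(0) never occurs.
        if marginal > 0:
            losses.append(-math.log(marginal))
    # Assumed convention: if every sample was skipped, report zero loss.
    if not losses:
        return 0.0
    return sum(losses) / len(losses)
```

With this sketch, a mention whose top-k list contains no positive candidate contributes nothing to the batch loss, and the mean is taken over the remaining mentions only.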

flyangovoyang commented 2 years ago

Oops, I asked because I didn't see this case explained in the paper. Since the code already handles this special case, everything is OK. Thank you for your time~