alexa / dialoglue

DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriented Dialogue
https://evalai.cloudcv.org/web/challenges/challenge-page/708/overview
Apache License 2.0

Fix: ignore tokens should be set to -100, not -1 #19

Open maxlchen opened 2 years ago

maxlchen commented 2 years ago

Issue #, if available:

Description of changes: This corrects the value assigned to unmasked indices in the target tensor in mask_tokens(). Currently, line 348 is:

    labels[~masked_indices] = -1

The value -1 breaks the loss function, because -1 is not the index that the loss ignores. The fix sets these positions to -100 instead. From the BertForMaskedLM documentation:

"Indices should be in [-100, 0, ..., config.vocab_size] ... Tokens with indices set to -100 are ignored (masked), the loss is only computed for the tokens with labels in [0, ..., config.vocab_size]"

This value constraint appears to hold in every version of the HuggingFace API that has documentation for BertForMaskedLM.
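To illustrate why the ignore value matters, here is a minimal pure-Python sketch of a masked-LM loss that skips positions labelled -100, which is also the default ignore_index of torch.nn.CrossEntropyLoss. The helper names build_labels and masked_cross_entropy are hypothetical and are not taken from this repository; the real code operates on tensors, and the exact failure mode with -1 there may differ from the IndexError raised here.

```python
import math

# Assumption: -100 mirrors the default ignore_index of torch.nn.CrossEntropyLoss.
IGNORE_INDEX = -100


def build_labels(input_ids, masked_indices, ignore_index=IGNORE_INDEX):
    """Keep the original token id at masked positions; mark all others as ignored."""
    return [
        token if masked else ignore_index
        for token, masked in zip(input_ids, masked_indices)
    ]


def masked_cross_entropy(logits, labels, ignore_index=IGNORE_INDEX):
    """Mean cross-entropy over the positions whose label is not ignore_index."""
    losses = []
    for row, label in zip(logits, labels):
        if label == ignore_index:
            continue  # unmasked position: contributes nothing to the loss
        if not 0 <= label < len(row):
            # A label such as -1 is neither ignored nor a valid class index.
            raise IndexError(f"label {label} is not a valid class index")
        # Numerically stable log-sum-exp for the softmax normaliser.
        peak = max(row)
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in row))
        losses.append(log_norm - row[label])
    # Sketch-only choice: return 0.0 when every position is ignored.
    return sum(losses) / len(losses) if losses else 0.0


# Toy example: 4 tokens, vocabulary of 8, uniform logits.
input_ids = [5, 2, 7, 1]
masked_indices = [False, True, False, True]
logits = [[0.0] * 8 for _ in input_ids]

good_labels = build_labels(input_ids, masked_indices)                  # uses -100
bad_labels = build_labels(input_ids, masked_indices, ignore_index=-1)  # old value

loss = masked_cross_entropy(logits, good_labels)
```

With good_labels, only the two masked positions contribute to the loss. Passing bad_labels to masked_cross_entropy raises an error in this sketch, because -1 is treated as a class index rather than as a position to skip.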

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.