Closed · wwt17 closed this 1 year ago
Name | Link |
---|---|
Latest commit | 5a6c9162c5a254931186de54b1b4c6b00726a6c0 |
Latest deploy log | https://app.netlify.com/sites/pyminicons/deploys/64f799d53b41370008add4a5 |
makes sense -- yeah I should have taken care of the redundant code there lol -- thanks for the PR!
The bug occurs when the sequence does not end with the eos token: instead of dropping the logits at the last token position, the code removes the logit of the last token in the vocabulary.

The problem is that `logit.squeeze(0)` has only two dimensions, of size `(max_length, vocab_size)`. So `logit.squeeze(0)[:, :-1]` does not remove the logits at the last token position; it just shrinks the vocabulary by dropping its last entry. This is definitely unintended. Also, it doesn't matter whether `idx[-1] == self.tokenizer.eos_token_id` (and the last token position is dropped) here, because the subsequent `[torch.arange(offset, length),]` subscript has the same effect. So the solution is to simply keep only the first clause.
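For anyone skimming this later, here is a minimal PyTorch sketch of the shape issue (the sizes and the `offset`/`length` values are made up for illustration; this is not the library's actual code):

```python
import torch

# Toy shapes only -- names and sizes are illustrative, not minicons' actual variables.
max_length, vocab_size = 5, 10
logit = torch.randn(1, max_length, vocab_size)   # model output for a batch of one sequence
offset, length = 1, 4                            # positions to score: offset .. length-1

squeezed = logit.squeeze(0)                      # shape: (max_length, vocab_size)

# Buggy clause: slices the vocabulary axis, not the token axis.
print(squeezed[:, :-1].shape)                    # torch.Size([5, 9])  -> vocab shrunk, all positions kept

# What was intended: drop the last token position.
print(squeezed[:-1, :].shape)                    # torch.Size([4, 10])

# The later advanced indexing selects only rows offset..length-1,
# so the last position is excluded either way.
print(squeezed[torch.arange(offset, length), ].shape)  # torch.Size([3, 10])
```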