Closed: aretius closed this issue 4 years ago.
We could check whether the token id is correctly converted at https://github.com/microsoft/unilm/blob/b3b78ee8710060dcada404acd015a79fab8343cb/unilm-v1/src/biunilm/decode_seq2seq.py#L172
If the token id is correctly obtained in the previous step, its decoding score is set to -10000, as in https://github.com/microsoft/unilm/blob/b3b78ee8710060dcada404acd015a79fab8343cb/unilm-v1/src/pytorch_pretrained_bert/modeling.py#L1539 , so the blocked token cannot appear among the top-k candidates.
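For illustration, here is a minimal sketch of that masking idea (the function and variable names are hypothetical, not the repo's own): blocked token ids receive a score of -10000 before the top-k selection, so they can never be chosen.

```python
import torch

def mask_forbidden_tokens(logits, forbidden_ids, penalty=-10000.0):
    # Assign a large negative score to every blocked token id so it
    # cannot survive a top-k selection over the vocabulary dimension.
    logits[:, forbidden_ids] = penalty
    return logits

logits = torch.randn(2, 30522)   # (batch, vocab) next-token scores; 30522 = BERT base vocab
forbidden_ids = [101]            # 101 is [CLS] in the standard BERT vocabularies
masked = mask_forbidden_tokens(logits, forbidden_ids)
topk_scores, topk_ids = masked.topk(5, dim=-1)  # [CLS] never appears here
```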
@donglixp I will check whether the decoding id is being obtained correctly. For the time being I will close this issue, and I will re-open it if something comes up.
Thanks!
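A quick way to verify that conversion, assuming the tokenizer vendored in this repo behaves like the standard pytorch_pretrained_bert one ('bert-base-cased' is just an example checkpoint):

```python
from pytorch_pretrained_bert.tokenization import BertTokenizer

tokenizer = BertTokenizer.from_pretrained('bert-base-cased')
# --not_predict_token can only take effect if this prints the real
# [CLS] id (101 for the standard BERT vocabularies), not the [UNK] id.
print(tokenizer.convert_tokens_to_ids(['[CLS]']))
```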
Hey all, I am trying to use the --not_predict_token flag to remove a certain token, [CLS], from the predictions. The issue is that while decoding news texts it still produces outputs like "Stocks in the news: RIL [CLS] [CLS][CLS][CLS][CLS][CLS][CLS][CLS][CLS][CLS][CLS][CLS][CLS]" and goes on. Of course, if I manually truncate the output at the first [CLS] token it makes perfect sense. However, even though I opted to use --not_predict_token, I still see outputs like the above. Any tips on how to improve such cases?
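As a stop-gap while the flag is being debugged, that manual truncation can be scripted; a minimal sketch (the helper name is made up):

```python
def truncate_at_token(text, stop_token='[CLS]'):
    # Keep only the part of the decoded string before the first
    # occurrence of stop_token; return it unchanged if absent.
    idx = text.find(stop_token)
    return text[:idx].rstrip() if idx != -1 else text

print(truncate_at_token('Stocks in the news: RIL [CLS] [CLS][CLS]'))
# -> Stocks in the news: RIL
```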