```
python3.7/site-packages/transformers/tokenization_utils_base.py", line 2387, in _get_padding_truncation_strategies
    if padding_strategy != PaddingStrategy.DO_NOT_PAD and (not self.pad_token or self.pad_token_id < 0):
TypeError: '<' not supported between instances of 'NoneType' and 'int'
```
When I debug the code, I find that the variable `self.pad_token_id` is `None`, which leads to the error. However, `self.pad_token` is `"<|endoftext|>"`, which is correct for a GPT-2-style tokenizer. It seems there is no `"<|endoftext|>"` token in the `vocab.json` file, so I would like to know how BiomedLM controls the stopping of generation.
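For illustration, here is a minimal stdlib-only sketch of why the check raises. The `TinyTokenizer` class and its vocab are hypothetical stand-ins, not the real transformers code: `pad_token` is stored as a string, but `pad_token_id` is resolved through the vocabulary, so a pad token missing from `vocab.json` yields `None`, and the comparison `self.pad_token_id < 0` then fails with the `TypeError` above.

```python
# Hypothetical minimal reproduction of the failure mode (not the real
# transformers implementation): the pad token string is set, but it is
# absent from the vocab, so the id lookup returns None.
vocab = {"hello": 0, "world": 1}  # made-up vocab without "<|endoftext|>"

class TinyTokenizer:
    def __init__(self, vocab, pad_token):
        self.vocab = vocab
        self.pad_token = pad_token

    @property
    def pad_token_id(self):
        # Returns None when the token is not in the vocabulary,
        # mirroring the None observed while debugging.
        return self.vocab.get(self.pad_token)

tok = TinyTokenizer(vocab, pad_token="<|endoftext|>")
assert tok.pad_token == "<|endoftext|>"  # looks correct
assert tok.pad_token_id is None          # so `pad_token_id < 0` raises TypeError
```

With the real tokenizer, one common workaround (assuming the model's embedding table already covers the token, otherwise it must be resized) is `tokenizer.add_special_tokens({"pad_token": "<|endoftext|>"})`, which registers the token so that `pad_token_id` resolves to an integer.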