Closed. wangyu1997 closed this 8 months ago.
This would be a breaking change to the API, and quite a few use cases rely on controlling when added tokens are encoded or decoded. For example, not encoding special tokens removes the need to sanitize user input, and it is necessary if you want to be able to encode something like `</s>` as text rather than as a control symbol.
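To make the point above concrete, here is a minimal sketch of per-call control over special tokens. `ToyTokenizer` is a hypothetical byte-level tokenizer written only for illustration; it is not this library's implementation, and only the parameter names `encode_special_tokens` and `decode_special_tokens` are borrowed from the discussion.

```python
import re


class ToyTokenizer:
    """Hypothetical byte-level tokenizer, for illustration only."""

    def __init__(self, special_tokens):
        # Bytes use ids 0..255; special tokens get ids from 256 upward.
        self.special = {tok: 256 + i for i, tok in enumerate(special_tokens)}
        self.special_inv = {i: tok for tok, i in self.special.items()}
        self._pattern = None
        if special_tokens:
            alternatives = "|".join(re.escape(t) for t in special_tokens)
            # The capturing group makes re.split keep the matched tokens.
            self._pattern = re.compile("(" + alternatives + ")")

    def encode(self, text, encode_special_tokens=False):
        # Default: treat everything, including "</s>", as plain text.
        if not encode_special_tokens or self._pattern is None:
            return list(text.encode("utf-8"))
        ids = []
        for piece in self._pattern.split(text):
            if piece in self.special:
                ids.append(self.special[piece])  # control symbol
            else:
                ids.extend(piece.encode("utf-8"))  # ordinary text
        return ids

    def decode(self, ids, decode_special_tokens=False):
        out = bytearray()
        for i in ids:
            if i < 256:
                out.append(i)
            elif decode_special_tokens:
                # Render the control symbol as its string form.
                out.extend(self.special_inv[i].encode("utf-8"))
            # Otherwise the control symbol is silently dropped.
        return out.decode("utf-8")


tok = ToyTokenizer(["</s>"])
user_input = "hi</s>"

as_text = tok.encode(user_input)
as_control = tok.encode(user_input, encode_special_tokens=True)
```

With the default, untrusted input cannot inject a control symbol, so no sanitizing step is needed; the caller opts in to control symbols only for text it trusts.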
Remove the "encode/decode_special_tokens" parameters from the `encode`/`decode` methods, add a global `enable_special_tokens` option, and turn it on depending on whether `add_tokens.json` exists.
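As a rough sketch of what the proposed default could look like, the snippet below derives a single global flag from the presence of the file. The helper name `resolve_enable_special_tokens` and the idea of checking a model directory are assumptions made for illustration; the pull request text only names the `enable_special_tokens` option and the `add_tokens.json` file.

```python
import tempfile
from pathlib import Path


def resolve_enable_special_tokens(model_dir, filename="add_tokens.json"):
    """Hypothetical helper: the global option is on only if the file exists."""
    return (Path(model_dir) / filename).is_file()


# Demonstrate both outcomes in a throwaway directory.
with tempfile.TemporaryDirectory() as model_dir:
    without_file = resolve_enable_special_tokens(model_dir)
    (Path(model_dir) / "add_tokens.json").write_text("{}")
    with_file = resolve_enable_special_tokens(model_dir)
```

Under this design the flag is fixed once per tokenizer, so a caller can no longer choose per call whether a given string is encoded as text or as a control symbol.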