This is the preferred function over tokenizer.convert_ids_to_tokens() for user-facing data.
Quote from links:
Spaces are converted in a special character (the Ġ) in the tokenizer prior to
BPE splitting mostly to avoid digesting spaces since the standard BPE algorithm
used spaces in its process
Summary: Convert ids to tokens without the special Unicode marker characters (e.g., Ġ) that byte-level BPE tokenizers use to encode spaces. See: https://github.com/huggingface/transformers/issues/4786 and https://discuss.huggingface.co/t/bpe-tokenizers-and-spaces-before-words/475/2
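For illustration, a minimal sketch of the difference between raw vocabulary tokens and decoded, user-facing tokens. The names `ToyBPETokenizer` and `decode_tokens` are hypothetical and are not taken from this change; the toy class only imitates the GPT-2 convention of marking a leading space with Ġ so the example runs without downloading a real tokenizer.

```python
from typing import Dict, List


class ToyBPETokenizer:
    """Hypothetical stand-in that imitates the GPT-2 'Ġ' space marker."""

    def __init__(self, vocab: Dict[int, str]) -> None:
        self._vocab = vocab

    def convert_ids_to_tokens(self, ids: List[int]) -> List[str]:
        # Return the raw vocabulary entries, marker characters included.
        return [self._vocab[i] for i in ids]

    def decode(self, ids: List[int]) -> str:
        # Join the raw tokens and map the marker back to a real space.
        return "".join(self._vocab[i] for i in ids).replace("Ġ", " ")


def decode_tokens(tokenizer, ids: List[int]) -> List[str]:
    """Decode each id on its own so every token is human-readable.

    Assumption: the tokenizer exposes decode(list_of_ids) -> str.
    """
    return [tokenizer.decode([i]) for i in ids]


tokenizer = ToyBPETokenizer({0: "Hello", 1: "Ġworld", 2: "!"})
ids = [0, 1, 2]

raw_tokens = tokenizer.convert_ids_to_tokens(ids)
readable_tokens = decode_tokens(tokenizer, ids)

print(raw_tokens)       # ['Hello', 'Ġworld', '!']
print(readable_tokens)  # ['Hello', ' world', '!']
```

Decoding one id at a time keeps the token boundaries intact while replacing the marker with the space it stands for, which is usually what a user-facing display needs.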
Differential Revision: D62672912