Open vturrisi opened 1 year ago
@vturrisi I'll get to this as soon as I can. What is the `skip_special_tokens` arg meant to do?
No worries @gpucce. It basically removes the sos and eos tokens and padding from the decoded string. https://huggingface.co/docs/transformers/main_classes/tokenizer
Right now the tokenizer `decode` method supports only a single instance at a time. I think it would be good to have a `batch_decode` function and also support `skip_special_tokens` and `clean_up_tokenization_spaces`, as in Hugging Face.
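
To make the request concrete, here is a minimal sketch of the behavior being asked for. It is not this repository's tokenizer or the Hugging Face implementation: the toy vocabulary, the special-token ids, and the whitespace cleanup rule are all assumptions made up for illustration. Only the names `decode`, `batch_decode`, `skip_special_tokens`, and `clean_up_tokenization_spaces` come from the discussion above.

```python
from typing import Dict, List, Sequence

# Hypothetical toy vocabulary; a real tokenizer would use its learned vocabulary.
VOCAB: Dict[int, str] = {
    0: "<pad>", 1: "<sos>", 2: "<eos>",
    3: "a", 4: "photo", 5: "of", 6: "cat", 7: "dog", 8: ".",
}
SPECIAL_IDS = {0, 1, 2}  # assumed ids for padding, start and end tokens


def decode(
    ids: Sequence[int],
    skip_special_tokens: bool = False,
    clean_up_tokenization_spaces: bool = False,
) -> str:
    """Decode a single sequence of token ids into a string."""
    tokens = [
        VOCAB[i]
        for i in ids
        if not (skip_special_tokens and i in SPECIAL_IDS)
    ]
    text = " ".join(tokens)
    if clean_up_tokenization_spaces:
        # Simplified cleanup (an assumption): drop the space before punctuation.
        for mark in ".,!?":
            text = text.replace(" " + mark, mark)
    return text


def batch_decode(batch: Sequence[Sequence[int]], **kwargs) -> List[str]:
    """Decode every sequence in a batch by delegating to decode()."""
    return [decode(ids, **kwargs) for ids in batch]


# Two padded sequences, as they might come out of a generation step.
batch = [
    [1, 3, 4, 5, 3, 6, 8, 2, 0, 0],
    [1, 3, 7, 8, 2, 0, 0, 0, 0, 0],
]
decoded = batch_decode(
    batch, skip_special_tokens=True, clean_up_tokenization_spaces=True
)
print(decoded)
```

With both flags off, `decode` keeps the `<sos>`, `<eos>`, and `<pad>` markers in the output string, which is the current single-instance behavior the issue describes wanting to extend.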