SinanAkkoyun closed 6 months ago
I find this a little strange. Does this still happen on the latest version? The relevant code is:
try:
    id_to_ord = self.tokenizer.get_id_to_ord_list()
    b = [id_to_ord[x] for x in self.held_utf8_tokens[0].tolist()]
    c = bytes(b).decode('utf-8')
except ValueError:
    id_to_piece = self.tokenizer.get_id_to_piece_list()
    c = "".join(id_to_piece[x] for x in self.held_utf8_tokens[0].tolist())
except UnicodeDecodeError:
    c = "�"
So it shouldn't be able to throw a ValueError here.
I'll assume this was a version mismatch. Feel free to reopen if not.
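For anyone who wants to check the exception handling in the snippet above without loading a model, here is a minimal standalone sketch. The function `decode_held` and the two lookup lists are hypothetical stand-ins for the tokenizer tables, and the `-1` entry for a non-byte token is an assumption made only for illustration. It relies on two standard Python facts: `bytes()` raises `ValueError` for values outside 0..255, and `UnicodeDecodeError` is a subclass of `ValueError`, so the first `except` clause handles both cases.

```python
def decode_held(ids, id_to_ord, id_to_piece):
    """Hypothetical stand-in for the quoted snippet, using plain lists."""
    try:
        b = [id_to_ord[x] for x in ids]
        return bytes(b).decode("utf-8")
    except ValueError:
        # Reached when bytes() rejects a value outside 0..255, and also
        # when decode() fails, since UnicodeDecodeError subclasses ValueError.
        return "".join(id_to_piece[x] for x in ids)
    except UnicodeDecodeError:
        # Kept to mirror the quoted code; the clause above matches first.
        return "\ufffd"


# Hypothetical tables: three byte tokens forming U+20AC, plus one
# ordinary token marked with -1 (an assumed "not a byte" marker).
id_to_ord = [0xE2, 0x82, 0xAC, -1]
id_to_piece = ["<0xE2>", "<0x82>", "<0xAC>", "hi"]

complete = decode_held([0, 1, 2], id_to_ord, id_to_piece)   # valid UTF-8 sequence
non_byte = decode_held([3], id_to_ord, id_to_piece)         # bytes() raises ValueError
truncated = decode_held([0], id_to_ord, id_to_piece)        # decode() raises UnicodeDecodeError
```

In all three cases the exception is handled inside the function, which is consistent with the reading that a `ValueError` should not escape this block on a version that contains it.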