Closed (walkrunning closed this 11 months ago)
usually it means the model has decided you have reached [endoftext]
Most implementations of encoding, decoding, and reverse prompts for this model (mine included) are incorrect. Encoding becomes almost correct with World's greedy tokenizer, but decoding and reverse prompts will still be buggy. I hope to document how to do this properly in the future.
Most implementations of encoding, decoding, and reverse prompts for this model (mine included) are incorrect
Hm, why, given that the official `tokenizers` Python library is used? Or do you mean only these `if` statements like `some_byte_sequence in string`?
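For illustration only (this is not necessarily the bug meant above): one way a substring check for a reverse prompt can go wrong is when it is applied to each decoded piece separately, since the stop string may be split across several tokens. A minimal sketch, assuming hypothetical decoded pieces and a hypothetical helper `find_stop`, neither taken from any real library:

```python
def find_stop(pieces, stop):
    """Return the text generated before `stop`, or None if `stop` never appears.

    The check runs on the accumulated text, so a stop string that is split
    across several pieces is still detected.
    """
    text = ""
    for piece in pieces:
        text += piece
        idx = text.find(stop)
        if idx != -1:
            return text[:idx]
    return None


# Hypothetical decoded token pieces; a real tokenizer may split differently.
pieces = ["Hello", "\n", "\n", "User", ":"]
stop = "\n\nUser:"

# Checking each piece on its own never sees the full stop string.
per_piece_hit = any(stop in piece for piece in pieces)

# Checking the accumulated text finds it and truncates the output.
truncated = find_stop(pieces, stop)
```

Here `per_piece_hit` is `False` while `truncated` is `"Hello"`, because the reverse prompt only exists once the pieces are joined.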
implementers forget that sometimes encode(str1) + encode(str2) != encode(str1 + str2)
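A toy example of that inequality, assuming a made-up four-entry vocabulary and a simple greedy longest-match encoder (a sketch, not the actual World tokenizer):

```python
# Toy vocabulary, invented for illustration only.
VOCAB = {"a": 1, "b": 2, "ab": 3, "c": 4}


def encode(text: str) -> list[int]:
    """Greedy longest-match encoding over the toy vocabulary."""
    tokens = []
    max_len = max(len(piece) for piece in VOCAB)
    i = 0
    while i < len(text):
        # Try the longest candidate first, then progressively shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if piece in VOCAB:
                tokens.append(VOCAB[piece])
                i += length
                break
        else:
            raise ValueError(f"cannot encode {text[i]!r}")
    return tokens


separate = encode("a") + encode("b")  # two single-character tokens
joined = encode("ab")                 # one merged token
```

Here `separate` is `[1, 2]` but `joined` is `[3]`, so encoding a prompt in pieces and concatenating the token lists can give the model a different sequence than encoding the whole string at once.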
usually it means the model has decided you have reached [endoftext]
Haha, da Lao. I think you are right.
With the novel model, "+++" produces no output after several rounds of writing, and other instructions produce no output either. I can only reset the model.
decoded: str = tokenizer.decode(accumulated_tokens)
if '\uFFFD' not in decoded:
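My reading of the snippet above is that it holds tokens back until they decode without the replacement character U+FFFD, which shows up when a multi-byte UTF-8 character is split across tokens. A minimal sketch of that idea, assuming a hypothetical `ByteTokenizer` stand-in (one token per byte) and a hypothetical `stream_decode` helper; the body of the `if` is my guess, not the original code:

```python
class ByteTokenizer:
    """Hypothetical stand-in tokenizer: each token id is one raw UTF-8 byte."""

    def encode(self, text: str) -> list[int]:
        return list(text.encode("utf-8"))

    def decode(self, tokens: list[int]) -> str:
        # Incomplete byte sequences decode to U+FFFD instead of raising.
        return bytes(tokens).decode("utf-8", errors="replace")


def stream_decode(tokenizer, token_stream):
    """Yield text chunks, holding tokens back until they decode cleanly."""
    accumulated_tokens = []
    for token in token_stream:
        accumulated_tokens.append(token)
        decoded: str = tokenizer.decode(accumulated_tokens)
        if '\uFFFD' not in decoded:
            # Assumed body: emit the text and start a fresh buffer.
            yield decoded
            accumulated_tokens = []


tokenizer = ByteTokenizer()
chunks = list(stream_decode(tokenizer, tokenizer.encode("hi 你")))
```

The three bytes of "你" are only emitted once all of them have arrived, so `chunks` is `["h", "i", " ", "你"]`. One caveat with this approach: if the model produces a byte sequence that never becomes valid UTF-8, the buffer is never flushed and nothing more is printed.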