Open 980202006 opened 5 months ago
If I use BPE, split_by_num truncates the id values regardless of whether split_by_whitespace is selected.
print(sp.id_to_piece(111))  # 65, 26
@azimjonn Could you share your detailed configuration? The URL you gave points to the default configuration.
I have some id values and I want to train BPE on them. The following is an example of the id values.
I want to extract the class [26865, 26865, ] as a vocabulary.
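To make the goal concrete, here is a minimal pure-Python sketch of a single BPE merge step that treats each id as one atomic symbol, so no id is ever split into digits. It is not how sentencepiece works internally. The sample sequences and the helper names `most_frequent_pair` and `merge_pair` are made up for illustration.

```python
from collections import Counter


def most_frequent_pair(sequences):
    """Return (pair, count) for the most frequent adjacent pair of ids."""
    counts = Counter()
    for seq in sequences:
        # zip(seq, seq[1:]) yields every adjacent pair in the sequence.
        counts.update(zip(seq, seq[1:]))
    if not counts:
        return None, 0
    return counts.most_common(1)[0]


def merge_pair(seq, pair, new_symbol):
    """Replace each non-overlapping occurrence of `pair` with `new_symbol`."""
    out = []
    i = 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_symbol)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


# Hypothetical id sequences; each integer is one atomic symbol.
sequences = [[26865, 26865, 111], [7, 26865, 26865], [111, 7]]

pair, count = most_frequent_pair(sequences)
# The merged pair itself (a tuple) serves as the new vocabulary symbol.
merged = [merge_pair(seq, pair, pair) for seq in sequences]
print(pair, count)
print(merged)
```

Because the ids are never converted to text, a pair such as (26865, 26865) can become a vocabulary entry as a whole, which is the behavior asked for above.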