Added the split_by_token_ordinary method and its corresponding iterator, split_by_token_ordinary_iter, to the CoreBPE struct in vendor_tiktoken.rs. These methods tokenize a string using only the BPE model's ordinary tokens, without special tokens.
Simplified the split_by_token_with_special_tokens method name to just split_by_token, and made the method names distinguish between methods that return an iterator and methods that return a collection.
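For illustration only, here is a minimal sketch of the iterator-versus-collection naming split described above. ToySplitter is a hypothetical stand-in, not the real CoreBPE; the signatures and the String item type are assumptions, and whitespace splitting stands in for the actual BPE logic.

```rust
/// Hypothetical stand-in for a tokenizer; this is NOT the real CoreBPE.
pub struct ToySplitter;

impl ToySplitter {
    /// Lazy variant (`_iter` suffix): yields one piece at a time.
    /// Assumption: the signature and item type are illustrative only.
    pub fn split_by_token_ordinary_iter<'a>(
        &'a self,
        text: &'a str,
    ) -> impl Iterator<Item = String> + 'a {
        // Whitespace splitting is a placeholder for real BPE tokenization.
        text.split_whitespace().map(|piece| piece.to_string())
    }

    /// Eager variant (no suffix): drains the iterator into a Vec.
    pub fn split_by_token_ordinary(&self, text: &str) -> Vec<String> {
        self.split_by_token_ordinary_iter(text).collect()
    }
}
```

Under these assumptions, the collection method is a thin wrapper over the iterator method, so callers that only need the first few pieces can use the `_iter` variant and avoid allocating the full Vec.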