Open yuanxiaoyu1 opened 1 year ago
Is it possible to customize the tokenizer? I would really appreciate an example from anybody. I tried to debug the code, but it's wrapped and wrapped and wrapped..., and I can't find the code that actually calls the SentencePiece tokenizer.
I want a tokenizer that splits text with regular expressions. tensorflow_text seems to have a regex_split op, but I can't see how to plug it into t5x.