Open Aya-S opened 1 year ago
Hi @Aya-S, do you mean a method that goes from a list of token IDs back to a string? Could you elaborate on the scenario where this would be useful?
Some tokens don't have an entry in the tokenizer vocabulary, so the process is not completely reversible.
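To illustrate one way a round trip can lose information, here is a minimal sketch. It assumes a toy word-level tokenizer with an unknown-token fallback; `VOCAB`, `encode`, and `decode` are hypothetical names made up for this example and are not this library's API.

```python
# Toy vocabulary; every name here is hypothetical and for illustration only.
VOCAB = {"hello": 0, "world": 1, "<unk>": 2}
ID_TO_TOKEN = {token_id: token for token, token_id in VOCAB.items()}


def encode(text: str) -> list[int]:
    """Map whitespace-separated words to IDs, falling back to <unk>."""
    return [VOCAB.get(word, VOCAB["<unk>"]) for word in text.split()]


def decode(ids: list[int]) -> str:
    """Map IDs back to words; the word originally behind <unk> is lost."""
    return " ".join(ID_TO_TOKEN[token_id] for token_id in ids)


original = "hello brave world"
round_trip = decode(encode(original))
print(round_trip)  # hello <unk> world
```

The word "brave" is not in the toy vocabulary, so it comes back as `<unk>` and the original text cannot be recovered from the IDs alone.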
A good use case for a Decode() method would be a TokenTextSplitter() method. The process seems to be reversible, since other libraries have working decode methods, such as: https://github.com/hyunwoongko/gpt2-tokenizer-java/blob/master/src/main/java/ai/tunib/tokenizer/GPT2Tokenizer.java
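A rough sketch of what I mean by a token-based text splitter, and why it needs decode. The function name `split_by_tokens` and the character-level `toy_encode` / `toy_decode` stand-ins are assumptions for illustration only; a real splitter would plug in the library's own encode and decode once both exist.

```python
from typing import Callable


def split_by_tokens(
    text: str,
    encode: Callable[[str], list[int]],
    decode: Callable[[list[int]], str],
    max_tokens: int,
) -> list[str]:
    """Split text into chunks of at most max_tokens tokens each."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    ids = encode(text)
    # Slice the ID list into fixed-size windows, then decode each window.
    return [
        decode(ids[start:start + max_tokens])
        for start in range(0, len(ids), max_tokens)
    ]


# Hypothetical stand-in tokenizer: one token per character.
def toy_encode(text: str) -> list[int]:
    return [ord(ch) for ch in text]


def toy_decode(ids: list[int]) -> str:
    return "".join(chr(i) for i in ids)


chunks = split_by_tokens("abcdefg", toy_encode, toy_decode, max_tokens=3)
print(chunks)  # ['abc', 'def', 'g']
```

Without a decode step, the splitter could count tokens but could not turn each window of IDs back into a text chunk.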
Can't find documentation for a decode method in the cs class. Is decode not supported?