Closed uahmed93 closed 1 month ago
CTranslate2's Translator API works with string tokens, unlike other models, which work on integer token IDs. The current Tokenizer and Predictor operations run on integer tokens. A simple example of how the Translator API works:
    import ctranslate2

    translator = ctranslate2.Translator("ende_ctranslate2/", device="cpu")
    results = translator.translate_batch([["▁H", "ello", "▁world", "!"]])
    print(results[0].hypotheses[0])
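One way to bridge the two representations is to map integer IDs back to string tokens before calling `translate_batch`, and to map the returned string tokens back to IDs afterwards. The following is only a sketch under stated assumptions: the vocabulary `VOCAB` and the helpers `ids_to_tokens` and `tokens_to_ids` are hypothetical and exist only for illustration. In practice the mapping would come from the model's own tokenizer (Hugging Face tokenizers, for instance, provide `convert_ids_to_tokens`).

```python
from typing import Dict, List

# Hypothetical toy vocabulary, for illustration only.
# A real mapping would be taken from the model's tokenizer.
VOCAB: Dict[str, int] = {"▁H": 0, "ello": 1, "▁world": 2, "!": 3}
ID_TO_TOKEN: Dict[int, str] = {idx: tok for tok, idx in VOCAB.items()}


def ids_to_tokens(batch_ids: List[List[int]]) -> List[List[str]]:
    """Convert a batch of integer token IDs into string tokens."""
    return [[ID_TO_TOKEN[i] for i in seq] for seq in batch_ids]


def tokens_to_ids(batch_tokens: List[List[str]]) -> List[List[int]]:
    """Convert a batch of string tokens back into integer token IDs."""
    return [[VOCAB[t] for t in seq] for seq in batch_tokens]


# Integer IDs, as produced by an ID-based tokenizer step.
int_batch = [[0, 1, 2, 3]]

# String tokens, in the shape that translate_batch accepts.
string_batch = ids_to_tokens(int_batch)
print(string_batch)
```

The resulting `string_batch` has the same list-of-lists-of-strings shape as the input in the example above, so it could be passed to `translator.translate_batch(string_batch)`. The hypotheses that come back are also string tokens and could be mapped to IDs with `tokens_to_ids` if downstream operations expect integers.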
See https://github.com/rapidsai/crossfit/pull/83.