rapidsai / crossfit

Metric calculation library
Apache License 2.0

[FEA] Support for ctranslate2 model translation. #82

Closed uahmed93 closed 1 month ago

uahmed93 commented 1 month ago

CTranslate2's Translator API works with string tokens, unlike other models, which work on integer token IDs. The current Tokenizer and Predictor operations run on integer tokens. A simple example of how this works:

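import ctranslate2
# Load a converted CTranslate2 model directory on the CPU.
translator = ctranslate2.Translator("ende_ctranslate2/", device="cpu")
# Each input sentence is a list of string tokens, not integer token IDs.
results = translator.translate_batch([["▁H", "ello", "▁world", "!"]])
# The best hypothesis for the first sentence is also a list of string tokens.
print(results[0].hypotheses[0])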
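
To illustrate the gap described above, here is a minimal sketch of the conversion step that would sit between an integer-ID tokenizer and a string-token translator. The vocabulary, the padding ID, and the helper names `ids_to_tokens` and `tokens_to_ids` are all hypothetical and exist only for illustration; they are not part of crossfit or CTranslate2, and a real vocabulary would come from the model's own tokenizer.

```python
# Hypothetical toy vocabulary (assumption for illustration only).
ID_TO_TOKEN = {0: "<pad>", 1: "▁H", 2: "ello", 3: "▁world", 4: "!"}
TOKEN_TO_ID = {token: idx for idx, token in ID_TO_TOKEN.items()}
PAD_ID = 0  # assumed padding ID


def ids_to_tokens(batch_ids, pad_id=PAD_ID):
    """Convert padded rows of int IDs to lists of string tokens, dropping padding."""
    return [[ID_TO_TOKEN[i] for i in row if i != pad_id] for row in batch_ids]


def tokens_to_ids(batch_tokens):
    """Convert lists of string tokens back to lists of int IDs (no padding)."""
    return [[TOKEN_TO_ID[t] for t in row] for row in batch_tokens]


# A padded batch of int IDs, as an int-based tokenizer step might produce.
batch_ids = [[1, 2, 3, 4], [1, 2, 0, 0]]

# String-token form, the shape of input that translate_batch accepts.
batch_tokens = ids_to_tokens(batch_ids)
print(batch_tokens)
```

The resulting `batch_tokens` could be passed to `translator.translate_batch`, and the returned hypotheses mapped back with `tokens_to_ids` if downstream operations expect integers. Whether that conversion belongs in the tokenizer or the predictor is a design choice this sketch does not settle.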

sarahyurick commented 1 month ago

See https://github.com/rapidsai/crossfit/pull/83.