HennerM opened this issue 1 year ago
It would be great to support all the model types CTranslate2 now supports (like Whisper and encoder-only models).
I've been thinking about the best way to abstract the inference, since the different model classes expose different calling methods (generate, translate, ...). We could create a wrapper (or metaclass) that takes over initialization and inference handling, so we have a unified interface to talk to. Tbh, I still need to find out how easy that is.
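As a rough illustration of that idea, here is a minimal Python sketch. The `CT2Model` base class and its subclasses are made up for this example; only `ctranslate2.Translator`/`ctranslate2.Generator` and their `translate_batch`/`generate_batch` methods are real library APIs. Whisper would need a further subclass with a different signature, since it consumes audio features rather than token lists.

```python
# Minimal sketch of a unified inference interface over CTranslate2 model
# classes. Class names here are illustrative, not part of the backend.
from abc import ABC, abstractmethod
from typing import List

import ctranslate2


class CT2Model(ABC):
    """Common interface the backend would talk to, regardless of model class."""

    def __init__(self, model_dir: str, device: str = "cpu"):
        self.model_dir = model_dir
        self.device = device

    @abstractmethod
    def infer(self, batch: List[List[str]]) -> List[List[str]]:
        """Run one batch of tokenized inputs and return tokenized outputs."""


class EncoderDecoderModel(CT2Model):
    """Wraps ctranslate2.Translator for encoder-decoder models."""

    def __init__(self, model_dir: str, device: str = "cpu"):
        super().__init__(model_dir, device)
        self._model = ctranslate2.Translator(model_dir, device=device)

    def infer(self, batch):
        results = self._model.translate_batch(batch)
        return [r.hypotheses[0] for r in results]


class DecoderOnlyModel(CT2Model):
    """Wraps ctranslate2.Generator for decoder-only (language) models."""

    def __init__(self, model_dir: str, device: str = "cpu"):
        super().__init__(model_dir, device)
        self._model = ctranslate2.Generator(model_dir, device=device)

    def infer(self, batch):
        results = self._model.generate_batch(batch)
        return [r.sequences[0] for r in results]
```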
Let's specify the model type in the config.
Previously reported in https://github.com/speechmatics/ctranslate2_triton_backend/issues/2#issuecomment-1546889761 by @aamir-s18
The backend currently only supports encoder-decoder models, whereas the underlying library also supports decoder-only models: https://github.com/OpenNMT/CTranslate2/blob/master/src/models/language_model.cc
This should be fairly straightforward to add. Ideally we would auto-detect the model type, or alternatively specify it in the configuration.
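One hedged way to wire that up, continuing the sketch above: read an explicit type from the model configuration and fall back to a naive auto-detect. The `model_type` parameter name and the try-each-class heuristic are assumptions for illustration, not something the backend currently does.

```python
# Sketch of config-driven model selection with a naive auto-detect fallback.
import ctranslate2

# Hypothetical mapping from a config value to a CTranslate2 model class.
_MODEL_CLASSES = {
    "encoder_decoder": ctranslate2.Translator,
    "decoder_only": ctranslate2.Generator,
}


def load_model(model_dir: str, parameters: dict, device: str = "cpu"):
    # "model_type" would come from the Triton model config (e.g. a
    # parameters entry in config.pbtxt); the key name is an assumption.
    model_type = parameters.get("model_type")
    if model_type is not None:
        return _MODEL_CLASSES[model_type](model_dir, device=device)

    # Naive auto-detection: try each class until one accepts the model files.
    for cls in _MODEL_CLASSES.values():
        try:
            return cls(model_dir, device=device)
        except Exception:  # exact exception type depends on the CTranslate2 version
            continue
    raise ValueError(f"Could not determine the model type for {model_dir}")
```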