ELS-RD / transformer-deploy

Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
https://els-rd.github.io/transformer-deploy/
Apache License 2.0

Support for constrained beam-search in T5 #158

Open junwang-wish opened 1 year ago

junwang-wish commented 1 year ago

Hugging Face T5 models (and seq2seq models in general) support complex decoding schemes such as constrained beam search: https://huggingface.co/blog/constrained-beam-search. In my use case, I only need the simplest form of constrained decoding, where every decoded sequence has to belong to a pre-defined trie. In transformers, this can be done with `PrefixConstrainedLogitsProcessor`: https://huggingface.co/docs/transformers/internal/generation_utils#transformers.PrefixConstrainedLogitsProcessor
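To make the request concrete, here is a minimal sketch of the trie constraint on the transformers side. It is plain Python and does not touch transformer-deploy. The names `TokenTrie` and `make_prefix_fn` are hypothetical helpers, the token ids are made up for illustration, and the sketch assumes T5 conventions (decoder input starts with one start token, EOS id is 1).

```python
from typing import Callable, Dict, List, Sequence


class TokenTrie:
    """Trie over token-id sequences; each node maps a token id to its child node."""

    def __init__(self, sequences: Sequence[Sequence[int]]) -> None:
        self.root: Dict[int, dict] = {}
        for seq in sequences:
            node = self.root
            for token in seq:
                node = node.setdefault(token, {})

    def next_tokens(self, prefix: Sequence[int]) -> List[int]:
        """Return the token ids that may follow `prefix` ([] if none)."""
        node = self.root
        for token in prefix:
            if token not in node:
                return []
            node = node[token]
        return sorted(node)


def make_prefix_fn(
    trie: TokenTrie, eos_token_id: int, num_start_tokens: int = 1
) -> Callable[[int, Sequence[int]], List[int]]:
    """Build a callback with the (batch_id, input_ids) -> allowed ids shape.

    Assumption: the decoder input begins with `num_start_tokens` start
    tokens (one for T5), which are skipped before the trie lookup.
    """

    def prefix_allowed_tokens_fn(batch_id: int, input_ids) -> List[int]:
        # Accept either a tensor-like object or a plain list of ints.
        ids = input_ids.tolist() if hasattr(input_ids, "tolist") else list(input_ids)
        allowed = trie.next_tokens(ids[num_start_tokens:])
        # If the prefix is complete or has left the trie, only allow EOS
        # so that the set of allowed tokens is never empty.
        return allowed or [eos_token_id]

    return prefix_allowed_tokens_fn


EOS = 1  # assumed T5 EOS id
# Hypothetical token-id sequences, each ending with EOS.
trie = TokenTrie([[10, 11, EOS], [10, 12, EOS], [20, EOS]])
fn = make_prefix_fn(trie, eos_token_id=EOS)

print(fn(0, [0]))          # [10, 20]
print(fn(0, [0, 10]))      # [11, 12]
print(fn(0, [0, 10, 11]))  # [1]
```

With vanilla transformers, a callback like `fn` would be passed as `model.generate(..., num_beams=4, prefix_allowed_tokens_fn=fn)`, which applies `PrefixConstrainedLogitsProcessor` internally.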

Is this possible with transformer-deploy?