microsoft / DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Apache License 2.0

Does deepspeed-mii support prefix_allowed_tokens_fn? #477

Open zcakzhuu opened 1 month ago

zcakzhuu commented 1 month ago

I use the transformers pipeline to generate JSON dictionaries, and I need to pass a prefix_allowed_tokens_fn so that the tokens that can be generated at certain steps are fixed. From reading the source code, the DeepSpeed-MII pipeline does not appear to support this. Could someone confirm whether it is supported?
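For context, here is a minimal sketch of the kind of constraint described above. In transformers, `prefix_allowed_tokens_fn` is a callable that takes `(batch_id, input_ids)` and returns the list of token ids allowed at the next step. The vocabulary, scores, and the `toy_generate` loop below are hypothetical stand-ins so the example runs without a model; they are not part of transformers or DeepSpeed-MII.

```python
from typing import Callable, Dict, List, Sequence


def make_prefix_allowed_tokens_fn(
    prompt_len: int,
    forced_steps: Dict[int, List[int]],
    all_token_ids: List[int],
) -> Callable[[int, Sequence[int]], List[int]]:
    """Build a callback with the (batch_id, input_ids) -> allowed ids shape."""

    def prefix_allowed_tokens_fn(batch_id: int, input_ids: Sequence[int]) -> List[int]:
        # Number of tokens generated so far (input_ids includes the prompt).
        step = len(input_ids) - prompt_len
        # Forced steps return a fixed set; all other steps are unconstrained.
        return forced_steps.get(step, all_token_ids)

    return prefix_allowed_tokens_fn


# Hypothetical toy vocabulary and fixed "model" scores, for illustration only.
VOCAB = {0: "{", 1: '"name"', 2: ":", 3: '"Ada"', 4: "}", 5: "hello"}
SCORES = {0: 0.1, 1: 0.1, 2: 0.1, 3: 0.9, 4: 0.1, 5: 0.5}
ALL_IDS = list(VOCAB)


def toy_generate(prompt: List[int], fn, steps: int) -> List[int]:
    """Stand-in for a decoding loop: pick the best-scoring allowed token."""
    ids = list(prompt)
    for _ in range(steps):
        allowed = fn(0, ids)
        ids.append(max(allowed, key=lambda t: SCORES[t]))
    return ids


prompt = [5]
# Fix the JSON skeleton at steps 0, 1, 2 and 4; leave step 3 (the value) free.
fn = make_prefix_allowed_tokens_fn(len(prompt), {0: [0], 1: [1], 2: [2], 4: [4]}, ALL_IDS)
out = toy_generate(prompt, fn, steps=5)
decoded = "".join(VOCAB[i] for i in out[len(prompt):])
print(decoded)
```

With transformers, a callback of this shape is passed as the `prefix_allowed_tokens_fn` keyword argument of `generate`. Whether the MII pipeline accepts an equivalent argument is the open question here.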