microsoft / DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
Apache License 2.0

Support for FLAN-T5 #106

Open jihan-yin opened 1 year ago

jihan-yin commented 1 year ago

I saw that T5 wasn't in the list of supported Hugging Face Transformers models. Are there plans / an ETA for when the T5 family will be added? FLAN-T5 is a very strong LLM for zero-/few-shot instruction prompting. I am currently building a hacky implementation for hosting with DeepSpeed-Inference, but having it natively supported in DeepSpeed-MII would be ideal.

mrwyattii commented 1 year ago

We do support the T5 family with DeepSpeed-Inference with a custom injection policy (see this DeepSpeed unit test). However, we have not yet brought this support into MII. It's on our radar to add this in the future. We are also open to outside contributions if you would like to submit a PR!
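For anyone who wants to try this before MII support lands, here is a minimal sketch of using DeepSpeed-Inference with a custom injection policy for T5, following the pattern in the DeepSpeed unit test mentioned above. The module names come from the Hugging Face `T5Block` implementation; the checkpoint name, dtype, and generation arguments are illustrative, not prescribed by this thread. It requires a CUDA GPU.

```python
import torch
import deepspeed
from transformers import T5ForConditionalGeneration, T5Tokenizer
from transformers.models.t5.modeling_t5 import T5Block

# Illustrative checkpoint; any T5-family model (e.g. FLAN-T5) follows the same pattern.
model_name = "google/flan-t5-large"
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Custom injection policy: since T5 has no automatic kernel-injection path,
# we tell DeepSpeed which output projections inside each T5Block to shard.
engine = deepspeed.init_inference(
    model,
    mp_size=1,            # tensor-parallel degree
    dtype=torch.float16,
    injection_policy={
        T5Block: ('SelfAttention.o', 'EncDecAttention.o', 'DenseReluDense.wo')
    },
)

inputs = tokenizer(
    "Translate English to German: Hello, world!",
    return_tensors="pt",
).to(torch.cuda.current_device())
outputs = engine.module.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

This is the same injection-policy mechanism the unit test exercises; only the wrapping around `init_inference` would need to move into MII for native support.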

jeffra commented 1 year ago

Also keep an eye on this PR, it’s currently a work in progress for better T5 support: https://github.com/microsoft/DeepSpeed/pull/2451

mhillebrand commented 9 months ago

Assuming that PR does get merged, would it also support LongT5?