triton-inference-server / fastertransformer_backend


Support BLOOM model? #69

Closed pai4451 closed 1 year ago

pai4451 commented 1 year ago

Hi triton developer,

Currently, the FasterTransformer backend only supports GPT, GPT-J, T5, GPT-NeoX, and BERT.

I am wondering whether there is any plan to support the bigscience/bloom model. If not, is it possible (and how) to convert BLOOM to one of the supported models and serve it with Triton's FasterTransformer backend?
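For reference, a quick way to see how far apart the two architectures are is to diff their published configs. The sketch below assumes only the public `transformers` `AutoConfig` API and the `bigscience/bloom` and `gpt2` model cards:

```python
from transformers import AutoConfig

# Fetch only the JSON configs; no model weights are downloaded.
bloom = AutoConfig.from_pretrained("bigscience/bloom")
gpt2 = AutoConfig.from_pretrained("gpt2")

# model_type differs ("bloom" vs "gpt2"), which is why a plain GPT
# checkpoint converter cannot be pointed at BLOOM weights as-is.
print(bloom.model_type, gpt2.model_type)
print(bloom.n_layer, bloom.n_head, bloom.hidden_size)
```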

byshiue commented 1 year ago

We have a plan to support BLOOM. You cannot use the GPT backend to run BLOOM now because their model architectures are a little different.
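The most visible difference is that BLOOM uses ALiBi attention biases instead of GPT's learned positional embeddings (and adds a LayerNorm over the word embeddings). A minimal sketch of the ALiBi bias computation, following the ALiBi paper's power-of-two slope rule (BLOOM's 112 heads are not a power of two, so the paper's interpolation rule would apply to the real model):

```python
import math
import torch

def alibi_slopes(num_heads: int) -> torch.Tensor:
    # Geometric sequence of per-head slopes from the ALiBi paper:
    # m_i = 2^(-8i/n) for i = 1..n (exact for power-of-two head counts).
    start = 2.0 ** (-8.0 / num_heads)
    return torch.tensor([start ** (i + 1) for i in range(num_heads)])

def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    # Bias added to the attention logits: -slope * (query_pos - key_pos),
    # i.e. zero on the diagonal and increasingly negative with distance
    # in the causal (lower-triangular) region.
    slopes = alibi_slopes(num_heads)                                  # [heads]
    distance = torch.arange(seq_len)[None, :] - torch.arange(seq_len)[:, None]
    return slopes[:, None, None] * distance[None, :, :]              # [heads, q, k]
```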

pai4451 commented 1 year ago

> We have a plan to support BLOOM. You cannot use the GPT backend to run BLOOM now because their model architectures are a little different.

Thanks @byshiue, I’m glad that BLOOM will be supported. Will this come in the near future?

byshiue commented 1 year ago

This is supported in the latest v1.3 release. Thank you for the suggestion.
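For anyone landing here later: once a converted BLOOM model is loaded, a request can be sent with the standard `tritonclient` Python API. A minimal sketch; the model name (`fastertransformer`) and tensor names (`input_ids`, `input_lengths`, `request_output_len`, `output_ids`) follow the GPT examples in this repo's docs and should be checked against your model's config.pbtxt:

```python
import numpy as np
import tritonclient.http as httpclient

# Token IDs would normally come from the BLOOM tokenizer; these are placeholders.
input_ids = np.array([[9038, 2402, 257]], dtype=np.uint32)
input_lengths = np.array([[input_ids.shape[1]]], dtype=np.uint32)
output_len = np.array([[32]], dtype=np.uint32)

inputs = []
for name, data in [("input_ids", input_ids),
                   ("input_lengths", input_lengths),
                   ("request_output_len", output_len)]:
    t = httpclient.InferInput(name, list(data.shape), "UINT32")
    t.set_data_from_numpy(data)
    inputs.append(t)

client = httpclient.InferenceServerClient(url="localhost:8000")
result = client.infer("fastertransformer", inputs)
print(result.as_numpy("output_ids"))
```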

pai4451 commented 1 year ago

@byshiue Thanks for supporting BLOOM. According to the FT backend docs on BLOOM, there should be a test script tools/gpt/bloom_test.py, but somehow I couldn't find it in the repo. Also, could you share the config.ini parameter settings for BLOOM-176B as well?
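In the meantime, here is a config.ini sketch based on BLOOM-176B's published hyperparameters (70 layers, 112 heads, hidden size 14336, vocab size 250880, bos/eos token IDs 1/2). The section name and key set follow the GPT-style config format used in the FasterTransformer examples and may differ from what the released converter emits:

```ini
[gpt]
model_name = bloom-176b
head_num = 112
size_per_head = 128          ; 14336 hidden / 112 heads
inter_size = 57344           ; 4 * hidden
num_layer = 70
vocab_size = 250880
start_id = 1                 ; BLOOM <s>
end_id = 2                   ; BLOOM </s>
weight_data_type = fp16
tensor_para_size = 8         ; assumed 8-way tensor parallelism for 176B
```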