triton-inference-server / fastertransformer_backend

BSD 3-Clause "New" or "Revised" License

Allow mT5 support alongside T5 #21

Closed RegaliaXYZ closed 2 years ago

RegaliaXYZ commented 2 years ago

Currently mT5 is not fully supported: passing an mT5 model to t5_ckpt_convert.py yields multiple errors of the type: [ERROR] cannot find key 'encoder.block.0.layer.1.DenseReluDense.wi_0.weight'
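
For context (an addition, not part of the original report): mT5 checkpoints use a gated feed-forward layer with two input projections, wi_0 and wi_1, where plain T5 has a single wi, which is where the missing-key error comes from. Below is a minimal sketch, assuming the HuggingFace transformers package and the t5-small and google/mt5-small checkpoints, that prints the differing parameter names:

```python
# Compare feed-forward parameter names in T5 vs mT5 checkpoints
# (assumes the HuggingFace `transformers` package and network access).
from transformers import MT5ForConditionalGeneration, T5ForConditionalGeneration

def ff_keys(model):
    # Parameter names of the first encoder block's feed-forward sublayer.
    return sorted(k for k in model.state_dict()
                  if "encoder.block.0.layer.1.DenseReluDense" in k)

t5 = T5ForConditionalGeneration.from_pretrained("t5-small")
mt5 = MT5ForConditionalGeneration.from_pretrained("google/mt5-small")

print(ff_keys(t5))
# [...'DenseReluDense.wi.weight', ...'DenseReluDense.wo.weight']
print(ff_keys(mt5))
# [...'DenseReluDense.wi_0.weight', ...'DenseReluDense.wi_1.weight',
#  ...'DenseReluDense.wo.weight']
```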

byshiue commented 2 years ago

As we understand it, the model architectures of T5 and mT5 are the same, so the error here is caused only by name mapping. Because FasterTransformer is only a library, not an inference engine or framework, it requires some manual work to support a model with a different weight-naming scheme.
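
To illustrate the kind of manual work meant here, the following is a hypothetical sketch; the FasterTransformer-side names are illustrative placeholders, not the converter's actual internals. The idea is that the converter maps each HuggingFace parameter name to a target weight, and mT5's gated feed-forward introduces wi_0/wi_1 keys that a T5-only table does not contain:

```python
# Hypothetical name-mapping table; FT-side names are placeholders.
T5_FF_MAP = {
    "DenseReluDense.wi.weight": "ffn.inter.weight",  # single input projection
    "DenseReluDense.wo.weight": "ffn.out.weight",
}

# Extending the table covers mT5's gated activation (two input projections).
MT5_FF_MAP = {
    **T5_FF_MAP,
    "DenseReluDense.wi_0.weight": "ffn.inter.weight",
    "DenseReluDense.wi_1.weight": "ffn.inter2.weight",
}

def ft_name(table: dict, hf_key: str) -> str:
    """Map a HuggingFace key to a target weight name via the table."""
    suffix = hf_key.split("layer.1.", 1)[-1]
    if suffix not in table:
        # With T5_FF_MAP this reproduces the issue's error for mT5 keys.
        raise KeyError(f"cannot find key '{hf_key}'")
    return table[suffix]

# ft_name(T5_FF_MAP, "encoder.block.0.layer.1.DenseReluDense.wi_0.weight")
#   -> KeyError: cannot find key '...wi_0.weight'
# ft_name(MT5_FF_MAP, "encoder.block.0.layer.1.DenseReluDense.wi_0.weight")
#   -> 'ffn.inter.weight'
```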

byshiue commented 2 years ago

mT5 is supported in the latest release. There are some small differences between mT5 and standard T5, so FT v5.0 cannot support it directly.

byshiue commented 2 years ago

Closing this issue because it is inactive. Feel free to re-open it if you still have any problem.