Closed RegaliaXYZ closed 2 years ago
As we understand it, the model architectures of T5 and mT5 are identical, so the error here is caused only by name mapping. Because FasterTransformer is a library, not an inference engine or framework, supporting a model with a different naming scope requires some manual work.
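As an illustrative sketch of that manual work, the snippet below remaps mT5-style parameter names to the names a T5 converter expects. The helper `remap_mt5_keys` is hypothetical, not part of FasterTransformer; note that mT5's gated feed-forward has two projections (`wi_0`, `wi_1`) where standard T5 has a single `wi`, so a pure rename is not sufficient on its own and real support needs the converter to handle the extra weight.

```python
def remap_mt5_keys(state_dict):
    """Illustrative only: rename mT5 gated-FFN keys toward T5 naming.

    Assumption: we treat wi_0 as the counterpart of T5's wi; wi_1 has
    no direct T5 equivalent and would need dedicated handling in a
    real checkpoint converter.
    """
    remapped = {}
    for name, tensor in state_dict.items():
        new_name = name.replace("DenseReluDense.wi_0", "DenseReluDense.wi")
        remapped[new_name] = tensor
    return remapped


# Example: the key from the reported error now matches the T5 scheme.
sd = {"encoder.block.0.layer.1.DenseReluDense.wi_0.weight": "dummy-tensor"}
print(sorted(remap_mt5_keys(sd)))
```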
mT5 is supported in the latest release. There are some small differences between mT5 and standard T5, so FT v5.0 cannot support it directly.
Closing this bug because it is inactive. Feel free to re-open this issue if you still have any problems.
Currently mT5 is not fully supported: passing an mT5 model to `t5_ckpt_convert.py` yields multiple errors of the form `[ERROR] cannot find key 'encoder.block.0.layer.1.DenseReluDense.wi_0.weight'`.