google-research / multilingual-t5

Apache License 2.0

Export/Save mt5 model #21

Closed. ashissamal closed this issue 3 years ago

ashissamal commented 3 years ago
!t5_mesh_transformer \
  --model_dir="gs://t5-data/pretrained_models/mt5/base" \
  --use_model_api \
  --mode="export_predict" \
  --export_dir="{saved_model_dir}"

saved_model_path = os.path.join(saved_model_dir, max(os.listdir(saved_model_dir)))

When I run the above code, it loads t5-base, which results in the error below because of a shape mismatch. How can I export an mT5 model?

tensorflow.python.framework.errors_impl.InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [32128,768] rhs shape= [250112,768]
     [[{{node save/Assign_281}}]]  

irolnick commented 3 years ago

This looks like the result of a vocabulary mismatch. Try adding --module_import="multilingual_t5.tasks" so that the export code can pick up the multilingual vocabulary.
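To make the mismatch concrete, here is a minimal sketch that pulls the two shapes out of the reported error text and shows which axis disagrees. It assumes (this is not stated in the thread) that the lhs shape belongs to the variable built by the export graph and the rhs shape to the tensor stored in the mT5 checkpoint. `parse_shapes` is a hypothetical helper written for this illustration, not part of t5 or TensorFlow.

```python
import re

# Error text copied from the report above.
error = ("Assign requires shapes of both tensors to match. "
         "lhs shape= [32128,768] rhs shape= [250112,768]")


def parse_shapes(message):
    """Return every 'shape= [a,b,...]' in the message as a tuple of ints."""
    found = re.findall(r"shape=\s*\[([\d,\s]+)\]", message)
    return [tuple(int(dim) for dim in text.split(",")) for text in found]


lhs, rhs = parse_shapes(error)

# Collect the axes on which the two shapes differ.
mismatched_axes = [i for i, (a, b) in enumerate(zip(lhs, rhs)) if a != b]

print("lhs:", lhs)
print("rhs:", rhs)
print("mismatched axes:", mismatched_axes)
```

Only the first axis differs while the second (768) agrees, which is what one would expect if the two sides share a model width but use different vocabulary sizes.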

ashissamal commented 3 years ago

Yes, I tried with --module_import="multilingual_t5.tasks" and get the error below on t5==0.7.1:

ModuleNotFoundError: No module named 'multilingual_t5'

adarob commented 3 years ago

You need to first clone this repo and call the launch script from within the main repo directory.
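A quick way to check this before launching the export is to test whether multilingual_t5 is importable from the current environment. The sketch below is only an illustration: `module_available` is a hypothetical helper, and both `repo_dir` and `saved_model_dir` are placeholder paths to adjust. The command at the end just combines the flags already quoted in this thread; it is printed rather than executed, and whether it resolves the export is not confirmed here.

```python
import importlib.util
import os
import shlex
import sys


def module_available(name, search_dir=None):
    """Return True if the top-level module `name` can be found.

    If search_dir is given, it is added to the front of sys.path first
    (a side effect that persists for the rest of the process).
    """
    if search_dir is not None and search_dir not in sys.path:
        sys.path.insert(0, search_dir)
    importlib.invalidate_caches()
    return importlib.util.find_spec(name) is not None


# Placeholder: where the multilingual-t5 repo was cloned.
repo_dir = os.path.abspath("multilingual-t5")

if module_available("multilingual_t5", repo_dir):
    print("multilingual_t5 is importable")
else:
    print("multilingual_t5 not found; clone the repo and run from its root")

# Placeholder export directory.
saved_model_dir = "/tmp/mt5_saved_model"

# Flags taken from the posts above, with the suggested --module_import added.
cmd = [
    "t5_mesh_transformer",
    "--model_dir=gs://t5-data/pretrained_models/mt5/base",
    "--use_model_api",
    "--mode=export_predict",
    "--export_dir=" + saved_model_dir,
    "--module_import=multilingual_t5.tasks",
]
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Running the printed command from the root of the cloned repository puts the multilingual_t5 package on the import path for the launched process, which is the condition the ModuleNotFoundError points to.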