Saving Distillation model

Hey @PranjalChitale,

I was trying to save the distilled model with the provided script, convert_indictrans_checkpoint_to_pytorch.py.

Because lm_head.weight and model.decoder.embed_tokens.weight share a single tensor, the save step fails with the following error:

File "/workspace/research/IndicTrans2/huggingface_interface/convert_indictrans_checkpoint_to_pytorch.py", line 107, in <module>
    model.save_pretrained(args.pytorch_dump_folder_path)
  File "/opt/conda/envs/itv2/lib/python3.9/site-packages/transformers/modeling_utils.py", line 2546, in save_pretrained
    raise RuntimeError(
RuntimeError: The weights trying to be saved contained shared tensors [{'lm_head.weight', 'model.decoder.embed_tokens.weight'}] that are mismatching the transformers base configuration. Try saving using `safe_serialization=False` or remove this tensor sharing.

For the distilled models, are you saving with safe_serialization=False, or are you handling the shared weights some other way? Thanks!
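
For reference, this is roughly what the workaround named in the error message would look like on my side. It is only a minimal sketch: `save_allowing_shared_tensors` is a hypothetical helper name (it is not part of the conversion script), and I am assuming `model` is the transformers model built by the script, so that `save_pretrained` accepts the `safe_serialization` keyword mentioned in the RuntimeError.

```python
from pathlib import Path


def save_allowing_shared_tensors(model, dump_dir):
    """Save `model` without safetensors serialization.

    `model` is assumed to be a transformers PreTrainedModel (any object with a
    compatible `save_pretrained` method works). The helper name is hypothetical.
    """
    dump_dir = Path(dump_dir)
    # Make sure the target folder exists before saving into it.
    dump_dir.mkdir(parents=True, exist_ok=True)
    # safe_serialization=False is the option suggested by the RuntimeError;
    # the shared lm_head / decoder embedding weights are left tied.
    model.save_pretrained(str(dump_dir), safe_serialization=False)
    return dump_dir
```

In the script this would stand in for the failing call, for example `save_allowing_shared_tensors(model, args.pytorch_dump_folder_path)`. It keeps the weights tied rather than removing the sharing, which is why I am asking whether this matches how the released distilled checkpoints were saved.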

AI4Bharat / IndicTrans2, issue #77