[BUG]- state_dict saved by "accelerator" cannot be loaded due to "shared tensors" problem

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

MIT License

The "shared tensor" problem occurs in "prior_encode.py":

    self.cfg = cfg
    self.enc_emb_tokens = nn.Embedding(
        cfg.vocab_size, cfg.encoder.encoder_hidden, padding_idx=0
    )
    self.enc_emb_tokens.weight.data.normal_(mean=0.0, std=1e-5)
    self.encoder = TransformerEncoder(
        enc_emb_tokens=self.enc_emb_tokens, cfg=cfg.encoder
    )

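The same embedding is referenced both as self.enc_emb_tokens and through self.encoder, so it is a shared tensor. When saving, the accelerator removes the shared tensor "enc_emb_tokens" from the state_dict, and loading the saved state then raises an error.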
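To make the mechanism concrete, here is a minimal sketch using plain PyTorch. `ToyEncoder` and `ToyPrior` are hypothetical stand-ins for `TransformerEncoder` and the module in "prior_encode.py", and the sketch assumes the encoder stores the embedding it receives. The removal of the duplicate key is simulated by hand; it is not a call into the accelerator itself.

```python
from torch import nn


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for TransformerEncoder: keeps the embedding it is given."""

    def __init__(self, enc_emb_tokens):
        super().__init__()
        self.enc_emb_tokens = enc_emb_tokens


class ToyPrior(nn.Module):
    """Hypothetical stand-in for the module defined in prior_encode.py."""

    def __init__(self, vocab_size=10, hidden=4):
        super().__init__()
        # The embedding is registered here...
        self.enc_emb_tokens = nn.Embedding(vocab_size, hidden, padding_idx=0)
        # ...and a second time inside the encoder, so two names share one tensor.
        self.encoder = ToyEncoder(enc_emb_tokens=self.enc_emb_tokens)


model = ToyPrior()
state = model.state_dict()
keys = list(state.keys())

# Both entries point at the same underlying memory.
same_storage = (
    state["enc_emb_tokens.weight"].data_ptr()
    == state["encoder.enc_emb_tokens.weight"].data_ptr()
)

# Simulate a saver that keeps only one name per shared tensor (assumption:
# the top-level "enc_emb_tokens.weight" is the one that gets dropped).
deduped = {k: v for k, v in state.items() if k != "enc_emb_tokens.weight"}

try:
    ToyPrior().load_state_dict(deduped)  # strict=True by default
    load_error = None
except RuntimeError as err:
    load_error = str(err)

print(keys)
print(same_storage)
print(load_error)
```

The strict load fails because the freshly built module still expects both names, while the saved dictionary only contains one of them.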

You can fix this bug by rewriting the code as follows, so that the embedding is owned only by the encoder:

    def enc_emb_tokens(cfg):
        # Build the token embedding without registering it on the parent module.
        enc_emb_tokens = nn.Embedding(
            cfg.vocab_size, cfg.encoder.encoder_hidden, padding_idx=0
        )
        enc_emb_tokens.weight.data.normal_(mean=0.0, std=1e-5)
        return enc_emb_tokens

    self.cfg = cfg
    self.encoder = TransformerEncoder(enc_emb_tokens(cfg), cfg=cfg.encoder)
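One way to check that a restructured module no longer contains shared tensors is sketched below. `ToyEncoder`, `ToyPriorFixed`, `make_enc_emb_tokens`, and `shared_key_groups` are hypothetical names introduced for illustration only; the sketch again assumes the encoder simply stores the embedding it is given.

```python
from torch import nn


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for TransformerEncoder: keeps the embedding it is given."""

    def __init__(self, enc_emb_tokens):
        super().__init__()
        self.enc_emb_tokens = enc_emb_tokens


def make_enc_emb_tokens(vocab_size, hidden):
    """Build the embedding outside the parent module (hypothetical helper)."""
    emb = nn.Embedding(vocab_size, hidden, padding_idx=0)
    emb.weight.data.normal_(mean=0.0, std=1e-5)
    return emb


class ToyPriorFixed(nn.Module):
    """Toy version of the fix: only the encoder holds the embedding."""

    def __init__(self, vocab_size=10, hidden=4):
        super().__init__()
        self.encoder = ToyEncoder(make_enc_emb_tokens(vocab_size, hidden))


def shared_key_groups(module):
    """Return lists of state_dict names that point at the same memory."""
    groups = {}
    for name, tensor in module.state_dict().items():
        groups.setdefault(tensor.data_ptr(), []).append(name)
    return [names for names in groups.values() if len(names) > 1]


fixed = ToyPriorFixed()
fixed_keys = list(fixed.state_dict().keys())
shared = shared_key_groups(fixed)

# A strict load into a fresh instance now succeeds with nothing missing.
result = ToyPriorFixed().load_state_dict(fixed.state_dict())

print(fixed_keys)
print(shared)
print(result.missing_keys, result.unexpected_keys)
```

With a single owner there is only one name for the embedding weight, so a saver that deduplicates shared tensors has nothing to drop.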

[BUG]- state_dict saved by "accelerator" cannot be loaded due to "shared tensors" problem #149