05/14 04:33:48 AM gpu available: True, used: True
| model Arch: OfflineGaussianDiffusion
....
Traceback (most recent call last):
  File "tasks/run.py", line 15, in <module>
    run_task()
  File "tasks/run.py", line 10, in run_task
    task_cls.start()
  File "/mnt/DiffSinger/tasks/base_task.py", line 258, in start
    trainer.test(task)
  File "/mnt/DiffSinger/utils/pl_utils.py", line 586, in test
    self.fit(model)
  File "/mnt/DiffSinger/utils/pl_utils.py", line 489, in fit
    self.run_pretrain_routine(model)
  File "/mnt/DiffSinger/utils/pl_utils.py", line 541, in run_pretrain_routine
    self.restore_weights(model)
  File "/mnt/DiffSinger/utils/pl_utils.py", line 617, in restore_weights
    self.restore_state_if_checkpoint_exists(model)
  File "/mnt/DiffSinger/utils/pl_utils.py", line 655, in restore_state_if_checkpoint_exists
    self.restore(last_ckpt_path, self.on_gpu)
  File "/mnt/DiffSinger/utils/pl_utils.py", line 668, in restore
    model.load_state_dict(checkpoint['state_dict'], strict=False)
  File "/root/miniconda3/envs/env/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1223, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DiffSingerOfflineTask:
    size mismatch for model.fs2.encoder_embed_tokens.weight: copying a param with shape torch.Size([62, 256]) from checkpoint, the shape in current model is torch.Size([57, 256]).
    size mismatch for model.fs2.encoder.embed_tokens.weight: copying a param with shape torch.Size([62, 256]) from checkpoint, the shape in current model is torch.Size([57, 256]).
I got this runtime error when I ran
CUDA_VISIBLE_DEVICES=0 python tasks/run.py --config usr/configs/popcs_ds_beta6_offline.yaml --exp_name popcs_ds_beta6_offline_pmf0_1230 --reset --infer
exactly as given in the README. I changed the versions of scipy and torch slightly; I don't know if that is the problem.
Basically I just followed the instructions: I put the downloaded files in checkpoints, put data-example in /DiffSinger/data/processed/popcs/popcs-说散就散, preprocessed it, and got mnt/DiffSinger/data/binary/popcs-pmf0.
I suspect either I did something wrong or the author unintentionally released an ill-shaped pretrained model?
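A minimal sketch for narrowing this down, assuming the checkpoint is an ordinary PyTorch file with a 'state_dict' entry (as the traceback suggests). The helper names find_shape_mismatches and load_checkpoint_shapes are hypothetical, not part of DiffSinger. The first dimension of an embedding weight is normally the vocabulary size, so 62 vs 57 may mean that the phoneme set built while preprocessing data-example is smaller than the one used for training, rather than that the checkpoint is broken; that reading is a guess, not something the log confirms.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    ckpt_shapes: Dict[str, Shape], model_shapes: Dict[str, Shape]
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for every parameter
    that exists in both mappings but with different shapes."""
    mismatches = []
    for name, ckpt_shape in ckpt_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches.append((name, tuple(ckpt_shape), tuple(model_shape)))
    return mismatches


def load_checkpoint_shapes(ckpt_path: str) -> Dict[str, Shape]:
    """Hypothetical helper: read parameter shapes from a checkpoint file.
    Assumes the file stores its weights under the 'state_dict' key."""
    import torch  # imported lazily so the comparison above works without torch

    checkpoint = torch.load(ckpt_path, map_location="cpu")
    return {name: tuple(t.shape) for name, t in checkpoint["state_dict"].items()}


# Shapes copied by hand from the error message above.
ckpt_shapes = {
    "model.fs2.encoder_embed_tokens.weight": (62, 256),
    "model.fs2.encoder.embed_tokens.weight": (62, 256),
}
model_shapes = {
    "model.fs2.encoder_embed_tokens.weight": (57, 256),
    "model.fs2.encoder.embed_tokens.weight": (57, 256),
}

mismatches = find_shape_mismatches(ckpt_shapes, model_shapes)
for name, ckpt_shape, model_shape in mismatches:
    print(f"{name}: checkpoint {ckpt_shape} vs current model {model_shape}")
```

With a real checkpoint, load_checkpoint_shapes(path) would supply the first mapping, and {k: tuple(v.shape) for k, v in model.state_dict().items()} the second. If only the embedding rows differ, as here, checking which phoneme dictionary the binarized data was built with would be the next thing I would look at.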