lipku / metahuman-stream

Real time interactive streaming digital human
https://zhuanlan.zhihu.com/p/675131165
MIT License

Error when running with my own trained model, please take a look, thanks! #17

Closed stevin-dong closed 5 months ago

stevin-dong commented 5 months ago

After switching to my own trained model (I used HuBERT for the audio features), starting app.py fails with the error below. Could you take a look? Thanks!

    trainer = Trainer('ngp', opt, model, device=device, workspace=opt.workspace, criterion=criterion, fp16=opt.fp16, metrics=metrics, use_checkpoint=opt.ckpt)
  File "/root/nerf/nerf_triplane/utils.py", line 724, in __init__
    self.load_checkpoint(self.use_checkpoint)
  File "/root/nerf/nerf_triplane/utils.py", line 1824, in load_checkpoint
    missing_keys, unexpected_keys = self.model.load_state_dict(checkpoint_dict['model'], strict=False)
  File "/root/miniconda3/envs/er/lib/python3.9/site-packages/torch/nn/modules/module.py", line 2041, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for NeRFNetwork:
    size mismatch for individual_codes: copying a param with shape torch.Size([12000, 4]) from checkpoint, the shape in current model is torch.Size([10000, 4]).
    size mismatch for individual_codes_torso: copying a param with shape torch.Size([12000, 8]) from checkpoint, the shape in current model is torch.Size([10000, 8]).
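Editor's sketch (not from the thread): the error reports two parameters whose first dimension is 12000 in the checkpoint but 10000 in the model built by app.py. Passing strict=False to load_state_dict only tolerates missing or unexpected keys; a shape mismatch on a shared key still raises. The helper below, whose name find_shape_mismatches is hypothetical, lists such mismatches before loading. It works on plain name-to-shape dictionaries so it runs without PyTorch; with a real checkpoint, one would presumably build them with {k: tuple(v.shape) for k, v in state_dict.items()} from checkpoint_dict['model'] and model.state_dict().

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    checkpoint_shapes: Dict[str, Shape],
    model_shapes: Dict[str, Shape],
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, checkpoint_shape, model_shape) for shared keys whose shapes differ."""
    mismatches = []
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        # Keys missing from the model are ignored here; strict=False tolerates those.
        if model_shape is not None and tuple(model_shape) != tuple(ckpt_shape):
            mismatches.append((name, tuple(ckpt_shape), tuple(model_shape)))
    return mismatches


# Shapes copied from the error message in this comment.
checkpoint_shapes = {
    "individual_codes": (12000, 4),
    "individual_codes_torso": (12000, 8),
}
model_shapes = {
    "individual_codes": (10000, 4),
    "individual_codes_torso": (10000, 8),
}

for name, ckpt_shape, model_shape in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

Both mismatches are in the row count of the code tables, which suggests the model was constructed with a different table size at inference than at training time. Which option controls that size is not stated in this thread.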

lipku commented 5 months ago

The audio features used for training must be wav2vec.

stevin-dong commented 5 months ago

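lipku wrote: "The audio features used for training must be wav2vec."

We have always used HuBERT because the lip sync seems better with it. Is there really no way to change this?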

lipku commented 5 months ago

HuBERT can't be converted to streaming for now.

stevin-dong commented 5 months ago

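lipku wrote: "The audio features used for training must be wav2vec."

I retrained with wav2vec and still get the same error as above. My trained model is 38.4M, and the author's ngp_kf.pth file is 38M. What is going wrong?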

lipku commented 5 months ago

Then you probably trained the model in float32. Try commenting out the line opt.fp16 = True in app.py and see if that helps.
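Editor's sketch (not from the thread): one way to check the float32 hypothesis is to count the dtypes of the tensors stored in the checkpoint. The helper name count_dtypes is hypothetical, and the stand-in objects below only mimic the dtype attribute so the example runs without PyTorch. With a real file, one would presumably pass torch.load(path, map_location="cpu")["model"], using the 'model' key that appears in the traceback above.

```python
from collections import Counter
from types import SimpleNamespace


def count_dtypes(state_dict):
    """Count how many entries of a state dict use each dtype."""
    return Counter(
        str(value.dtype)
        for value in state_dict.values()
        if hasattr(value, "dtype")  # skip non-tensor entries
    )


# Stand-in entries (assumption): real tensors would report dtypes such as
# torch.float32 or torch.float16, which str() renders as shown here.
fake_state_dict = {
    "individual_codes": SimpleNamespace(dtype="torch.float32"),
    "individual_codes_torso": SimpleNamespace(dtype="torch.float32"),
}

print(count_dtypes(fake_state_dict))
```

If every entry reports float32, the checkpoint was saved in full precision, which would be consistent with the suggestion above.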

ThetaRgo commented 4 months ago

Same here, I'm using HuBERT. Could you add support for it? HuBERT gives somewhat better lip sync.

lipku commented 4 months ago

Only models trained with wav2vec are supported.

xiao-keeplearning commented 1 month ago

Quick question: does HuBERT not support streaming? @lipku

lipku commented 1 month ago

HuBERT is now supported.

lianping1985 commented 1 month ago

lipku wrote: "HuBERT is now supported."

Does anything need to be changed to use it? My HuBERT-trained model still raises an error.