Open cixinlaoren opened 8 months ago
@cixinlaoren Have you solved this problem yet?
I found that you can simply restart training until it runs normally. The error is random, so retrying eventually gets past it.
I solved this error by cropping the input video to 512x512.
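For anyone who wants to try the 512x512 crop, here is a minimal sketch of one way to compute a centered crop box. The comment above does not say how the crop was done, so the centering, the helper names `center_crop_box` and `ffmpeg_crop_filter`, and the 1920x1080 example resolution are all assumptions for illustration. The only external detail relied on is ffmpeg's `crop=w:h:x:y` filter syntax.

```python
def center_crop_box(width, height, size=512):
    """Return (x, y, w, h) of a centered square crop of side `size`.

    Hypothetical helper: centering is an assumption, the face may not be
    in the middle of the frame in your video.
    """
    if width < size or height < size:
        raise ValueError(f"frame {width}x{height} is smaller than {size}x{size}")
    x = (width - size) // 2   # left edge of the crop
    y = (height - size) // 2  # top edge of the crop
    return x, y, size, size


def ffmpeg_crop_filter(width, height, size=512):
    """Build the argument for ffmpeg's crop filter: crop=w:h:x:y."""
    x, y, w, h = center_crop_box(width, height, size)
    return f"crop={w}:{h}:{x}:{y}"


if __name__ == "__main__":
    # Example resolution is an assumption, substitute your own video's size.
    print(ffmpeg_crop_filter(1920, 1080))
```

The resulting string can be passed to ffmpeg's `-vf` option. If the face is off-center, adjust `x` and `y` by hand rather than relying on the centered values.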
Sanity Val: 100%|██████████| 2/2 [00:01<00:00, 1.23step/s]
| Validation results@200000: {'total_loss': 0.0761, 'mse_loss': 0.0004, 'lpips_loss': 0.0757}
201524step [01:23, 18.18step/s, ambient_loss=2.23e-5, head_psnr=40.4, lr_0=7.81e-5, mse_loss=9.07e-5, weights_entropy_loss=0.00803]
Traceback (most recent call last):
File "E:\deeplearning\talkall\GeneFace\tasks\run.py", line 20, in <module>
run_task()
File "E:\deeplearning\talkall\GeneFace\tasks\run.py", line 14, in run_task
task_cls.start()
File "E:\deeplearning\talkall\GeneFace\utils\commons\base_task.py", line 251, in start
trainer.fit(cls)
File "E:\deeplearning\talkall\GeneFace\utils\commons\trainer.py", line 122, in fit
self.run_single_process(self.task)
File "E:\deeplearning\talkall\GeneFace\utils\commons\trainer.py", line 186, in run_single_process
self.train()
File "E:\deeplearning\talkall\GeneFace\utils\commons\trainer.py", line 286, in train
pbar_metrics, tb_metrics = self.run_training_batch(batch_idx, batch)
File "E:\deeplearning\talkall\GeneFace\utils\commons\trainer.py", line 333, in run_training_batch
output = task_ref.training_step(*args)
File "E:\deeplearning\talkall\GeneFace\utils\commons\base_task.py", line 109, in training_step
loss_ret = self._training_step(sample, batch_idx, optimizer_idx)
File "E:\deeplearning\talkall\GeneFace\tasks\radnerfs\radnerf.py", line 194, in _training_step
loss_output, model_out = self.run_model(sample)
File "E:\deeplearning\talkall\GeneFace\tasks\radnerfs\radnerf.py", line 152, in run_model
losses_out['lpips_loss'] = self.criterion_lpips(pred_rgb, gt_rgb).mean()
File "D:\ProgramData\Anaconda3\envs\geneface\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "D:\ProgramData\Anaconda3\envs\geneface\lib\site-packages\lpips\lpips.py", line 119, in forward
outs0, outs1 = self.net.forward(in0_input), self.net.forward(in1_input)
File "D:\ProgramData\Anaconda3\envs\geneface\lib\site-packages\lpips\pretrained_networks.py", line 85, in forward
h = self.slice3(h)
File "D:\ProgramData\Anaconda3\envs\geneface\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "D:\ProgramData\Anaconda3\envs\geneface\lib\site-packages\torch\nn\modules\container.py", line 217, in forward
input = module(input)
File "D:\ProgramData\Anaconda3\envs\geneface\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "D:\ProgramData\Anaconda3\envs\geneface\lib\site-packages\torch\nn\modules\pooling.py", line 166, in forward
return F.max_pool2d(input, self.kernel_size, self.stride,
File "D:\ProgramData\Anaconda3\envs\geneface\lib\site-packages\torch\_jit_internal.py", line 484, in fn
return if_false(*args, **kwargs)
File "D:\ProgramData\Anaconda3\envs\geneface\lib\site-packages\torch\nn\functional.py", line 782, in _max_pool2d
return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
RuntimeError: Given input size: (192x2x2). Calculated output size: (192x0x0). Output size is too small
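To see where the `(192x2x2)` in the error comes from, here is a rough sketch that traces spatial sizes through the first few layers of the LPIPS backbone using the standard output-size formula `floor((size + 2*padding - kernel) / stride) + 1`. It assumes the criterion uses the torchvision AlexNet backbone, which is consistent with the 192 channels in the message but is not confirmed by the traceback itself. The layer table and the names `ALEXNET_HEAD`, `trace_sizes`, and `smallest_valid_size` are illustrative, not part of GeneFace or `lpips`.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Output length of a conv/pool layer along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding) for the start of torchvision's AlexNet
# feature stack. Assumption: this is the backbone LPIPS is running here.
ALEXNET_HEAD = [
    ("conv1", 11, 4, 2),
    ("pool1", 3, 2, 0),
    ("conv2", 5, 1, 2),  # outputs 192 channels
    ("pool2", 3, 2, 0),  # the max_pool2d that raises in the traceback
]


def trace_sizes(size):
    """Return [(layer_name, output_size), ...], stopping at the first size <= 0."""
    sizes = []
    for name, kernel, stride, padding in ALEXNET_HEAD:
        size = out_size(size, kernel, stride, padding)
        sizes.append((name, size))
        if size <= 0:
            break
    return sizes


def smallest_valid_size(limit=1024):
    """Smallest square input side for which every layer output stays positive."""
    for side in range(1, limit + 1):
        sizes = trace_sizes(side)
        if len(sizes) == len(ALEXNET_HEAD) and sizes[-1][1] > 0:
            return side
    return None


if __name__ == "__main__":
    print(trace_sizes(24))
    print(smallest_valid_size())
```

Under these assumptions, a 24-pixel input reaches `pool2` as 2x2 and the pool output becomes 0, which matches the error message, and the smallest side that survives is 31 pixels. So the image passed to `self.criterion_lpips` was probably only about 23 to 30 pixels on one side for that batch, which may also explain why the failure looks random.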