YuvalNirkin / fsgan

FSGAN - Official PyTorch Implementation
https://nirkin.com/fsgan
Creative Commons Zero v1.0 Universal

RuntimeError: Expected 2D (unbatched) or 3D (batched) input to conv1d, but got input of size: [1, 1024, 32, 32] #162

Closed nvrmnd-gh closed 1 year ago

nvrmnd-gh commented 2 years ago

Running inference with the default parameters on the v1 branch appears to lead to a dimension error in the UnetUp class here: https://github.com/YuvalNirkin/fsgan/blob/v1/models/simple_unet.py#L135

python face_swap_video2video.py ../docs/examples/shinzo_abe.mp4 -t ../docs/examples/conan_obrien.mp4 -o output

results in: RuntimeError: Expected 2D (unbatched) or 3D (batched) input to conv1d, but got input of size: [1, 1024, 32, 32]

I had made some minor changes to run CPU-only while getting this working on my M1 before moving to a host with an Nvidia GPU, but nothing that would change dimensions here. Still, since I don't see this issue posted already, I assume it's something local to my machine or arguments; I'd appreciate any pointers.
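
As a point of reference, here is a minimal standalone sketch (not FSGAN code; the layer sizes are hypothetical) that reproduces this class of error on PyTorch 1.11: nn.Conv1d only accepts 2D (unbatched) or 3D (batched) input, so any 4D [N, C, H, W] feature map reaching it fails.

import torch
import torch.nn as nn

# Hypothetical repro: a Conv1d fed a 4D feature map, as in the traceback.
conv1d = nn.Conv1d(in_channels=1024, out_channels=512, kernel_size=1)
features = torch.randn(1, 1024, 32, 32)  # 4D [N, C, H, W] decoder output
conv1d(features)  # RuntimeError: Expected 2D (unbatched) or 3D (batched) input to conv1d...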

prodigy-sub commented 2 years ago

same issue... don't know why...

YuvalNirkin commented 2 years ago

What is your PyTorch version?
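
(A quick way to check both the PyTorch version and the CUDA build it was compiled against:)

import torch
print(torch.__version__)   # e.g. 1.11.0+cu113
print(torch.version.cuda)  # CUDA build of the wheel, or None for CPU-only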

prodigy-sub commented 2 years ago

> What is your PyTorch version?

@YuvalNirkin my version is "1.11.0+cu113". I'm running the code in Colab.

prodigy-sub commented 2 years ago

With the sample source and target, I'm getting the following error:

/content/projects/fsgan/models/simple_unet_02.py in forward(self, inputs1, inputs2)
    133
    134     def forward(self, inputs1, inputs2):
--> 135         outputs2 = self.up(inputs2)
    136         outputs2 = self.conv1d(outputs2,)
    137         offset = outputs2.size()[2] - inputs1.size()[2]
...
RuntimeError: Expected 2D (unbatched) or 3D (batched) input to conv1d, but got input of size: [24, 1024, 32, 32]

nvrmnd-gh commented 2 years ago

My PyTorch version is 1.11.0

@prodigy-sub -- you're getting that dimension error (different first dimension) with the same input/target videos I posted originally?

prodigy-sub commented 2 years ago

Yes, correct. I'm using the Conan and Abe videos. It fails at the segmentation step.

prodigy-sub commented 2 years ago

@YuvalNirkin

The following is the full error message. I think there is some problem with the segmentation model, but since I'm trying to use the pre-trained weights that you offered, I don't think changing the model's layers is an option... do you have any solutions?

100%|██████████| 600/600 [03:35<00:00, 2.78frames/s]
=> Extracting sequences from detections in video: "source.mp4"...
100%|██████████| 601/601 [00:00<00:00, 11066.43it/s]
=> Cropping video sequences from video: "source.mp4"...
100%|██████████| 600/600 [00:04<00:00, 148.46it/s]
=> Computing face poses for video: "source_seq00.mp4"...
100%|██████████| 5/5 [00:03<00:00, 1.53batches/s]
=> Computing face landmarks for video: "source_seq00.mp4"...
100%|██████████| 10/10 [00:03<00:00, 2.72batches/s]
=> Computing face segmentation for video: "source_seq00.mp4"...
  0%|          | 0/25 [00:00<?, ?batches/s]

RuntimeError                              Traceback (most recent call last)
<ipython-input-...> in <module>()
     16
     17 face_swapping(source_path, target_path, output_tmp_path,
---> 18               select_source, select_target, finetune)
     19
     20 # Encode with audio and display result

9 frames
/content/projects/fsgan/inference/swap.py in __call__(self, source_path, target_path, output_path, select_source, select_target, finetune)
    237
    238         # Cache input
--> 239         source_cache_dir, source_seq_filepath, _ = self.cache(source_path)
    240         target_cache_dir, target_seq_filepath, _ = self.cache(target_path)
    241

/content/projects/fsgan/preprocess/preprocess_video.py in cache(self, input_path, output_dir)
    478
    479         # Cache segmentation
--> 480         self.process_segmentation(input_path, output_dir, seq_file_path)
    481
    482         return output_dir, seq_file_path, pose_file_path if self.cache_pose and is_vid else None

/content/projects/fsgan/preprocess/preprocess_video.py in process_segmentation(self, input_path, output_dir, seq_file_path)
    382
    383             # Compute segmentation
--> 384             raw_segmentation = self.S(frame)
    385             segmentation = torch.cat((prev_segmentation, raw_segmentation), dim=0) \
    386                 if prev_segmentation is not None else raw_segmentation

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1108         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1109                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110             return forward_call(*input, **kwargs)
   1111         # Do not call functions when jit is used
   1112         full_backward_hooks, non_full_backward_hooks = [], []

/content/projects/fsgan/models/simple_unet_02.py in forward(self, inputs)
     69
     70         center = self.center(maxpool4)
---> 71         up4 = self.up_concat4(conv4, center)
     72         up3 = self.up_concat3(conv3, up4)
     73         up2 = self.up_concat2(conv2, up3)

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1108         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1109                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110             return forward_call(*input, **kwargs)
   1111         # Do not call functions when jit is used
   1112         full_backward_hooks, non_full_backward_hooks = [], []

/content/projects/fsgan/models/simple_unet_02.py in forward(self, inputs1, inputs2)
    133     def forward(self, inputs1, inputs2):
    134         outputs2 = self.up(inputs2)
--> 135         outputs2 = self.conv1d(outputs2,)
    136         offset = outputs2.size()[2] - inputs1.size()[2]
    137         padding = 2 * [offset // 2, offset // 2]

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1108         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1109                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1110             return forward_call(*input, **kwargs)
   1111         # Do not call functions when jit is used
   1112         full_backward_hooks, non_full_backward_hooks = [], []

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py in forward(self, input)
    300
    301     def forward(self, input: Tensor) -> Tensor:
--> 302         return self._conv_forward(input, self.weight, self.bias)
    303
    304

/usr/local/lib/python3.7/dist-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight, bias)
    297                             _single(0), self.dilation, self.groups)
    298         return F.conv1d(input, weight, bias, self.stride,
--> 299                         self.padding, self.dilation, self.groups)
    300
    301     def forward(self, input: Tensor) -> Tensor:

RuntimeError: Expected 2D (unbatched) or 3D (batched) input to conv1d, but got input of size: [24, 1024, 32, 32]
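
One way to sanity-check whether the layer-type mismatch lives in the pretrained weights themselves is to list the parameter shapes in the segmentation checkpoint; a hedged sketch, with a hypothetical checkpoint path:

import torch

# Hypothetical path; substitute the actual segmentation checkpoint file.
checkpoint = torch.load('weights/segmentation_model.pth', map_location='cpu')
state_dict = checkpoint['state_dict'] if 'state_dict' in checkpoint else checkpoint
for name, tensor in list(state_dict.items())[:12]:
    # Conv1d weights are 3D [out, in, k]; Conv2d weights are 4D [out, in, kh, kw]
    print(name, tuple(tensor.shape))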

3vaD3 commented 2 years ago

SOLVED partially: I had the same issue [55, 1024, 32, 32]... for me at least it was a PyTorch incompatibility; it doesn't like PyTorch 1.11.0 nor 1.0.1...

I commented out the 5 lines regarding install dependencies (Anaconda, conda, pip3, etc.) and replaced them with what's below...

Install: PyTorch (we assume 1.5.1 but VISSL works with all PyTorch versions >=1.4)

!pip install torch==1.5.1+cu101 torchvision==0.6.1+cu101 -f https://download.pytorch.org/whl/torch_stable.html
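
(Note: as far as I know, the 1.5.1+cu101 wheels were only built for Python 3.8 and below, so this pin may fail to resolve on newer Colab runtimes.)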

install opencv

!pip install opencv-python

install apex by checking system settings: cuda version, pytorch version, python version

import sys
import torch

version_str = "".join([
    f"py3{sys.version_info.minor}_cu",
    torch.version.cuda.replace(".", ""),
    f"_pyt{torch.__version__[0:5:2]}"
])
print(version_str)
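
With Python 3.7, CUDA 10.1, and PyTorch 1.5.1 this prints py37_cu101_pyt151 (the [0:5:2] slice keeps just the digits of the version string), matching the wheel names in the apex index used below.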

install apex (pre-compiled with optimizer C++ extensions and CUDA kernels)

!pip install apex -f https://dl.fbaipublicfiles.com/vissl/packaging/apexwheels/{version_str}/download.html

install VISSL

!pip install vissl

NEW ISSUE: insufficient GPU RAM on the target.mp4 segmentation step!
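
To see how much headroom the GPU actually has at that point, a quick diagnostic (assumes a visible CUDA device; not part of the FSGAN code):

import torch

# Report GPU memory usage; compare against the card's total capacity.
print(torch.cuda.get_device_name(0))
print(f"allocated: {torch.cuda.memory_allocated(0) / 1e9:.2f} GB")
print(f"reserved:  {torch.cuda.memory_reserved(0) / 1e9:.2f} GB")
print(f"total:     {torch.cuda.get_device_properties(0).total_memory / 1e9:.2f} GB")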

zero-nnkn commented 2 years ago

You can try this issue's solution: #161

YuvalNirkin commented 1 year ago

Thank you. This issue should be fixed now. Follow the new installation instructions.