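**Describe the bug**
When resuming training from the pretrained `stylegan3-t-ffhqu-256x256.pkl` checkpoint, I receive the following error:

    RuntimeError: The size of tensor a (128) must match the size of tensor b (64) at non-singleton dimension 1

This happens even though I am passing the `--cbase 32768` flag. The first line of output reports `"w_dim": 512`, which I assume is the cause of the size mismatch, but I am unsure how to fix it.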

Console output and traceback:

    Creating output directory...
    Launching processes...
    Loading training set...
    /usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:490: UserWarning: This DataLoader will create 3 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
      cpuset_checked))
    Constructing networks...
    Resuming from "https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-t-ffhqu-256x256.pkl"
    Traceback (most recent call last):
      File "/content/drive/MyDrive/WIP/stylegan3/train.py", line 286, in <module>
        main() # pylint: disable=no-value-for-parameter
      File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "/content/drive/MyDrive/WIP/stylegan3/train.py", line 281, in main
        launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
      File "/content/drive/MyDrive/WIP/stylegan3/train.py", line 96, in launch_training
        subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
      File "/content/drive/MyDrive/WIP/stylegan3/train.py", line 47, in subprocess_fn
        training_loop.training_loop(rank=rank, **c)
      File "/content/drive/MyDrive/WIP/stylegan3/training/training_loop.py", line 162, in training_loop
        misc.copy_params_and_buffers(resume_data[name], module, require_all=False)
      File "/content/drive/MyDrive/WIP/stylegan3/torch_utils/misc.py", line 162, in copy_params_and_buffers
        tensor.copy_(src_tensors[name].detach()).requires_grad_(tensor.requires_grad)
    RuntimeError: The size of tensor a (128) must match the size of tensor b (64) at non-singleton dimension 1
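
One place where a 128 versus 64 pair could arise is the per-resolution channel count, which depends on the channel base. The sketch below is only an illustration under stated assumptions: the rule `min(channel_base // res, channel_max)` is assumed for the StyleGAN2-style discriminator rather than copied from the repository, the helper name `discriminator_channels` is hypothetical, and the value 16384 is a hypothetical setting for the checkpoint that should be checked against the configuration it was actually trained with. It does not establish which layer raised the error.

```python
# Assumed channel rule (not copied from the repository):
#   channels(res) = min(channel_base // res, channel_max)

def discriminator_channels(img_resolution, channel_base, channel_max=512):
    """Return {resolution: channel count} from img_resolution down to 4."""
    channels = {}
    res = img_resolution
    while res >= 4:
        channels[res] = min(channel_base // res, channel_max)
        res //= 2
    return channels

# Value passed on the command line in this report.
requested = discriminator_channels(256, channel_base=32768)
# Hypothetical setting for the pretrained checkpoint (an assumption).
halved = discriminator_channels(256, channel_base=16384)

for res in requested:
    marker = "" if requested[res] == halved[res] else "  <-- differs"
    print(f"{res:>4}: {requested[res]:>4} vs {halved[res]:>4}{marker}")
```

Under these assumptions the two settings disagree at the highest resolutions (128 versus 64 channels at resolution 256), which would match the sizes in the error message if the checkpoint had been trained with half the channel base requested here.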
**Expected behavior**
I would expect training to resume from the pretrained checkpoint without throwing an error.
**Desktop (please complete the following information):**
- OS: Google Colab
- PyTorch version: 1.11.0+cu113
- CUDA toolkit version: see the PyTorch build above (cu113)
- NVIDIA driver version: 460.32.03 (see `nvidia-smi` output below)
- GPU: Tesla T4 (see `nvidia-smi` output below)

`nvidia-smi` output:

    Mon Jul 11 05:50:49 2022
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
    | N/A   40C    P8     9W /  70W |      0MiB / 15109MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+

**Additional context**
Any help is very much appreciated.
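
To narrow down which tensor triggers the error inside `copy_params_and_buffers`, it may help to compare parameter shapes between the checkpoint and the freshly constructed network before copying. The following is a minimal sketch that works on plain name-to-shape dictionaries; the helper `find_shape_mismatches` and the layer names and shapes in the example are hypothetical and serve only to illustrate the idea.

```python
def find_shape_mismatches(src_shapes, dst_shapes):
    """Return (name, src_shape, dst_shape) for names in both dicts whose shapes differ."""
    mismatches = []
    for name, dst_shape in dst_shapes.items():
        src_shape = src_shapes.get(name)
        if src_shape is not None and tuple(src_shape) != tuple(dst_shape):
            mismatches.append((name, tuple(src_shape), tuple(dst_shape)))
    return mismatches

# Hypothetical shapes for illustration only. Real ones could be collected with
# {n: tuple(t.shape) for n, t in module.named_parameters()} on each network.
checkpoint_shapes = {"layer_a.weight": (64, 64, 3, 3), "layer_b.bias": (512,)}
new_model_shapes = {"layer_a.weight": (128, 128, 3, 3), "layer_b.bias": (512,)}

mismatches = find_shape_mismatches(checkpoint_shapes, new_model_shapes)
for name, src, dst in mismatches:
    print(f"{name}: checkpoint {src} vs new model {dst}")
```

Listing every mismatching name at once, instead of stopping at the first failing copy, should make it easier to see whether the difference is confined to the high-resolution layers or affects the whole network.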
**To Reproduce**
The run printed the following configuration before failing:

    Training options:
    {
      "G_kwargs": { "class_name": "training.networks_stylegan3.Generator", "z_dim": 512, "w_dim": 512, "mapping_kwargs": { "num_layers": 2 }, "channel_base": 32768, "channel_max": 512, "magnitude_ema_beta": 0.9994456359721023 },
      "D_kwargs": { "class_name": "training.networks_stylegan2.Discriminator", "block_kwargs": { "freeze_layers": 0 }, "mapping_kwargs": {}, "epilogue_kwargs": { "mbstd_group_size": 4 }, "channel_base": 32768, "channel_max": 512 },
      "G_opt_kwargs": { "class_name": "torch.optim.Adam", "betas": [ 0, 0.99 ], "eps": 1e-08, "lr": 0.0025 },
      "D_opt_kwargs": { "class_name": "torch.optim.Adam", "betas": [ 0, 0.99 ], "eps": 1e-08, "lr": 0.002 },
      "loss_kwargs": { "class_name": "training.loss.StyleGAN2Loss", "r1_gamma": 50.0, "blur_init_sigma": 0 },
      "data_loader_kwargs": { "pin_memory": true, "prefetch_factor": 2, "num_workers": 3 },
      "training_set_kwargs": { "class_name": "training.dataset.ImageFolderDataset", "path": "/content/drive/MyDrive/WIP/stylegan3/datasets/artimages-256x256.zip", "use_labels": false, "max_size": 1341, "xflip": false, "resolution": 256, "random_seed": 0 },
      "num_gpus": 1,
      "batch_size": 16,
      "batch_gpu": 8,
      "metrics": [ "fid50k_full" ],
      "total_kimg": 1,
      "kimg_per_tick": 4,
      "image_snapshot_ticks": 8,
      "network_snapshot_ticks": 8,
      "random_seed": 0,
      "ema_kimg": 5.0,
      "augment_kwargs": { "class_name": "training.augment.AugmentPipe", "xflip": 1, "rotate90": 1, "xint": 1, "scale": 1, "rotate": 1, "aniso": 1, "xfrac": 1, "brightness": 1, "contrast": 1, "lumaflip": 1, "hue": 1, "saturation": 1 },
      "ada_target": 0.6,
      "resume_pkl": "https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-t-ffhqu-256x256.pkl",
      "ada_kimg": 100,
      "ema_rampup": null,
      "run_dir": "/content/drive/MyDrive/WIP/stylegan3/results/00025-stylegan3-t-artimages-256x256-gpus1-batch16-gamma50"
    }

    Output directory:    /content/drive/MyDrive/WIP/stylegan3/results/00025-stylegan3-t-artimages-256x256-gpus1-batch16-gamma50
    Number of GPUs:      1
    Batch size:          16 images
    Training duration:   1 kimg
    Dataset path:        /content/drive/MyDrive/WIP/stylegan3/datasets/artimages-256x256.zip
    Dataset size:        1341 images
    Dataset resolution:  256
    Dataset labels:      False
    Dataset x-flips:     False

    Num images:  1341
    Image shape: [3, 256, 256]
    Label shape: [0]