NVlabs / stylegan3

Official PyTorch implementation of StyleGAN3

About training grayscale images #635

Open rememberBr opened 8 months ago

rememberBr commented 8 months ago

Describe the bug
RuntimeError: output with shape [64, 1, 1, 1] doesn't match the broadcast shape [64, 3, 1, 1]

To Reproduce
Steps to reproduce the behavior:

  1. In 'stylegan3' directory, run command 'python train.py --resume=https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-r-ffhqu-256x256.pkl ....'
  2. See error

Training options:
{
  "G_kwargs": { "class_name": "training.networks_stylegan3.Generator", "z_dim": 512, "w_dim": 512, "mapping_kwargs": { "num_layers": 2 }, "channel_base": 32768, "channel_max": 1024, "magnitude_ema_beta": 0.9977843871238888, "conv_kernel": 1, "use_radial_filters": true },
  "D_kwargs": { "class_name": "training.networks_stylegan2.Discriminator", "block_kwargs": { "freeze_layers": 0 }, "mapping_kwargs": {}, "epilogue_kwargs": { "mbstd_group_size": 4 }, "channel_base": 16384, "channel_max": 512 },
  "G_opt_kwargs": { "class_name": "torch.optim.Adam", "betas": [ 0, 0.99 ], "eps": 1e-08, "lr": 0.0025 },
  "D_opt_kwargs": { "class_name": "torch.optim.Adam", "betas": [ 0, 0.99 ], "eps": 1e-08, "lr": 0.002 },
  "loss_kwargs": { "class_name": "training.loss.StyleGAN2Loss", "r1_gamma": 2.0, "blur_init_sigma": 0, "blur_fade_kimg": 400.0 },
  "data_loader_kwargs": { "pin_memory": true, "prefetch_factor": 2, "num_workers": 3 },
  "training_set_kwargs": { "class_name": "training.dataset.ImageFolderDataset", "path": "../Gan/data/data/256L", "use_labels": false, "max_size": 4813, "xflip": true, "resolution": 256, "random_seed": 0 },
  "num_gpus": 1,
  "batch_size": 64,
  "batch_gpu": 64,
  "metrics": [],
  "total_kimg": 25000,
  "kimg_per_tick": 4,
  "image_snapshot_ticks": 5,
  "network_snapshot_ticks": 5,
  "random_seed": 0,
  "ema_kimg": 20.0,
  "augment_kwargs": { "class_name": "training.augment.AugmentPipe", "xflip": 1, "rotate90": 1, "xint": 1, "scale": 1, "rotate": 1, "aniso": 1, "xfrac": 1, "brightness": 1, "contrast": 1, "lumaflip": 1, "hue": 1, "saturation": 1 },
  "ada_target": 0.6,
  "resume_pkl": "https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-r-ffhqu-256x256.pkl",
  "ada_kimg": 100,
  "ema_rampup": null,
  "run_dir": "~/training-runs-rr/00000-stylegan3-r-256L-gpus1-batch64-gamma2"
}

Output directory: ~/training-runs-rr/00000-stylegan3-r-256L-gpus1-batch64-gamma2
Number of GPUs: 1
Batch size: 64 images
Training duration: 25000 kimg
Dataset path: ../Gan/data/data/256L
Dataset size: 4813 images
Dataset resolution: 256
Dataset labels: False
Dataset x-flips: True

Creating output directory...
Launching processes...
Loading training set...

Num images: 9626
Image shape: [1, 256, 256]
Label shape: [0]

Constructing networks...
Resuming from "https://api.ngc.nvidia.com/v2/models/nvidia/research/stylegan3/versions/1/files/stylegan3-r-ffhqu-256x256.pkl"
Traceback (most recent call last):
  File "/home/bairu/workspace/DatasetGEN/styleGan3/train.py", line 288, in <module>
    main() # pylint: disable=no-value-for-parameter
  File "/home/bairu/miniconda3/envs/stylegan3/lib/python3.9/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/bairu/miniconda3/envs/stylegan3/lib/python3.9/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/bairu/miniconda3/envs/stylegan3/lib/python3.9/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/bairu/miniconda3/envs/stylegan3/lib/python3.9/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/bairu/workspace/DatasetGEN/styleGan3/train.py", line 283, in main
    launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
  File "/home/bairu/workspace/DatasetGEN/styleGan3/train.py", line 98, in launch_training
    subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
  File "/home/bairu/workspace/DatasetGEN/styleGan3/train.py", line 49, in subprocess_fn
    training_loop.training_loop(rank=rank, **c)
  File "/home/bairu/workspace/DatasetGEN/styleGan3/training/training_loop.py", line 164, in training_loop
    misc.copy_params_and_buffers(resume_data[name], module, require_all=False)
  File "/home/bairu/workspace/DatasetGEN/styleGan3/torch_utils/misc.py", line 162, in copy_params_and_buffers
    tensor.copy_(src_tensors[name].detach()).requires_grad_(tensor.requires_grad)
RuntimeError: output with shape [64, 1, 1, 1] doesn't match the broadcast shape [64, 3, 1, 1]
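The failing call in the traceback is an in-place tensor copy, so the error can probably be reproduced outside the training script. The snippet below is a minimal sketch under that assumption: the two tensors are zero-filled placeholders with the shapes from the error message, not the real network weights.

```python
import torch

# Placeholder for a parameter of a network built for 1-channel images.
dst = torch.zeros(64, 1, 1, 1)
# Placeholder for the same parameter stored in a 3-channel (RGB) checkpoint.
src = torch.zeros(64, 3, 1, 1)

try:
    # In-place copy: src would have to broadcast to dst's shape, which is
    # impossible when the size-3 axis has to fit into a size-1 axis.
    dst.copy_(src)
    message = None
except RuntimeError as err:
    message = str(err)

print(message)
```

If this raises the same kind of message, the problem is a shape mismatch between the checkpoint and the newly constructed network rather than anything specific to the dataset files.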

Expected behavior
The target data I want to generate is single-channel grayscale images. When I train on grayscale images while resuming from the pre-trained model, I get this error.


Additional context
Training works if no pre-trained model is used. Is this because the pre-trained model expects three-channel input? What should I do if I want to use a pre-trained model to train on single-channel images?
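One way to check which parameters actually differ between the checkpoint and the grayscale network is to compare their shapes by name before copying anything. The sketch below is only an illustration: `split_by_shape` is a hypothetical helper, the parameter names are made up, and only the weight shapes are taken from the error message above.

```python
def split_by_shape(src_shapes, dst_shapes):
    """Partition destination entries into copyable and shape-mismatched ones.

    Both arguments map parameter names to shapes (tuples or lists of ints).
    """
    compatible, mismatched = [], []
    for name, dst_shape in dst_shapes.items():
        if name not in src_shapes:
            continue  # the checkpoint has nothing to copy for this entry
        src_shape = tuple(src_shapes[name])
        if src_shape == tuple(dst_shape):
            compatible.append(name)
        else:
            mismatched.append((name, src_shape, tuple(dst_shape)))
    return compatible, mismatched


# Hypothetical parameter names; weight shapes taken from the error message.
pretrained = {"first_conv.weight": (64, 3, 1, 1), "first_conv.bias": (64,)}
grayscale = {"first_conv.weight": (64, 1, 1, 1), "first_conv.bias": (64,)}

ok, bad = split_by_shape(pretrained, grayscale)
print(ok)
print(bad)
```

With PyTorch modules, the shape dictionaries could be built with something like `{k: tuple(v.shape) for k, v in module.state_dict().items()}`. Entries reported as mismatched are the ones that would need special handling (skipping, slicing, or re-initializing) when resuming from an RGB checkpoint.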

Neilstid commented 6 months ago

This cannot work because you are loading a pre-trained model that was trained to generate RGB images (3 channels). In my opinion there are three solutions:
- Modify training_loop.py so that, after loading, the Generator outputs only one channel (either R, G, or B). Not the easiest, but I think it may work.
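The idea of keeping a single channel of the generator's output could be sketched roughly as below. This is an assumption-laden illustration, not code from the repository: `SingleChannelOutput` and `DummyRGBGenerator` are hypothetical names, and the dummy generator only stands in for a loaded RGB model so that the shapes can be shown.

```python
import torch


class SingleChannelOutput(torch.nn.Module):
    """Hypothetical wrapper that keeps one channel of an RGB generator's output."""

    def __init__(self, rgb_generator, channel=0):
        super().__init__()
        self.rgb_generator = rgb_generator
        self.channel = channel

    def forward(self, *args, **kwargs):
        img = self.rgb_generator(*args, **kwargs)  # expected shape [N, 3, H, W]
        # Slice with a range so the channel axis is kept: result is [N, 1, H, W].
        return img[:, self.channel:self.channel + 1]


class DummyRGBGenerator(torch.nn.Module):
    """Stand-in for a pre-trained RGB generator; returns blank 8x8 RGB images."""

    def forward(self, z):
        return torch.zeros(z.shape[0], 3, 8, 8)


G_gray = SingleChannelOutput(DummyRGBGenerator(), channel=1)
out = G_gray(torch.randn(4, 512))
print(tuple(out.shape))
```

Note that this only covers the generator side. A discriminator built for 1-channel images would presumably still fail to load 3-channel weights, and the training loop may read attributes from the generator that this wrapper does not forward, so further changes would likely be needed.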

I hope it will help you :)