icon-lab / SynDiff

Official PyTorch implementation of SynDiff described in the paper (https://arxiv.org/abs/2207.08208).

training error #7

Closed ZhangZhiHao233 closed 1 year ago

ZhangZhiHao233 commented 1 year ago

Hi, I got an error when running the training script. I prepared my dataset as you describe. My training command is:

python3 train.py --image_size 280 --exp exp_syndiff --num_channels 1 --num_channels_dae 64 --ch_mult 1 1 2 2 4 4 --num_timesteps 4 --num_res_blocks 2 --batch_size 1 --contrast1 FS --contrast2 PD --num_epoch 500 --ngf 64 --embedding_type positional --use_ema --ema_decay 0.999 --r1_gamma 1. --z_emb_dim 256 --lr_d 1e-4 --lr_g 1.6e-4 --lazy_reg 10 --num_process_per_node 1 --save_content --local_rank 0 --input_path input_path --output_path output_path

The error is:

module_path = /mnt/experiment/code/test_SynDiff/utils/op
train data size:2400
val data size:800
initialize network with normal
initialize network with normal
initialize network with normal
initialize network with normal
Traceback (most recent call last):
  File "train.py", line 863, in <module>
    init_processes(0, size, train_syndiff, args)
  File "train.py", line 726, in init_processes
    fn(rank, gpu, args)
  File "train.py", line 455, in train_syndiff
    x1_0_predict_diff = gen_diffusive_1(torch.cat((x1_tp1.detach(),x2_0_predict),axis=1), t1, latent_z1)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 705, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/mnt/experiment/code/test_SynDiff/backbones/ncsnpp_generator_adagn.py", line 307, in forward
    hs = [modules[m_idx](x)]
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 399, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 395, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [64, 1, 3, 3], expected input[1, 2, 280, 280] to have 1 channels, but got 2 channels instead

Looking forward to your reply, thanks!

ZhangZhiHao233 commented 1 year ago

I resolved it by setting --num_channels to 2.

This is my training command:

python3 train.py --image_size 256 --exp exp_syndiff --num_channels 2 --num_channels_dae 64 --ch_mult 1 1 2 2 4 4 --num_timesteps 4 --num_res_blocks 2 --batch_size 1 --contrast1 FS --contrast2 PD --num_epoch 500 --ngf 64 --embedding_type positional --use_ema --ema_decay 0.999 --r1_gamma 1. --z_emb_dim 256 --lr_d 1e-4 --lr_g 1.6e-4 --lazy_reg 10 --num_process_per_node 1 --save_content --local_rank 0 --save_ckpt_every 2 --input_path input_path --output_path output_path
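For readers wondering why this change helps, the shape bookkeeping behind the error can be sketched in plain Python (no PyTorch required). This is only an illustration under stated assumptions: that x1_tp1 and x2_0_predict are each single-channel images (inferred from the 2-channel input in the error message), and that the generator's first convolution takes --num_channels input channels (inferred from the weight size [64, 1, 3, 3] reported with --num_channels 1). The helper names cat_shape, check_conv2d_input, and first_conv_weight_shape are hypothetical and are not part of SynDiff or PyTorch.

```python
def cat_shape(shapes, axis):
    """Shape obtained by concatenating tensors of the given shapes along `axis`."""
    first = shapes[0]
    for s in shapes[1:]:
        # Every dimension except `axis` has to agree, as torch.cat requires.
        same_rank = len(s) == len(first)
        if not same_rank or any(a != b for i, (a, b) in enumerate(zip(first, s)) if i != axis):
            raise ValueError(f"incompatible shapes for concatenation: {first} vs {s}")
    out = list(first)
    out[axis] = sum(s[axis] for s in shapes)
    return tuple(out)


def check_conv2d_input(weight_shape, input_shape, groups=1):
    """Raise ValueError if the input channel count does not match the conv weight.

    A conv2d weight has shape (out_channels, in_channels // groups, kH, kW),
    and the input has shape (batch, channels, height, width).
    """
    expected = weight_shape[1] * groups
    got = input_shape[1]
    if got != expected:
        raise ValueError(
            f"expected input{list(input_shape)} to have {expected} channels, "
            f"but got {got} channels instead"
        )


def first_conv_weight_shape(num_channels, out_channels=64, kernel=3):
    # ASSUMPTION: the first conv's in_channels equals --num_channels;
    # out_channels=64 and kernel=3 are taken from the weight size in the error.
    return (out_channels, num_channels, kernel, kernel)


# ASSUMPTION: both concatenated tensors are single-channel 280x280 images.
x1_tp1_shape = (1, 1, 280, 280)
x2_0_predict_shape = (1, 1, 280, 280)
gen_input_shape = cat_shape([x1_tp1_shape, x2_0_predict_shape], axis=1)

for num_channels in (1, 2):
    try:
        check_conv2d_input(first_conv_weight_shape(num_channels), gen_input_shape)
        print(f"--num_channels {num_channels}: shapes match")
    except ValueError as err:
        print(f"--num_channels {num_channels}: {err}")
```

Under these assumptions, the concatenation along axis=1 always yields a 2-channel input, so only --num_channels 2 gives a first-layer weight that accepts it.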

In reply to Fernand Pajot (Monday, December 5, 2022 at 23:20):

@ZhangZhiHao233 How did you resolve that issue?