akanimax / BMSG-GAN

[MSG-GAN] Any body can GAN! Highly stable and robust architecture. Requires little to no hyperparameter tuning. Pytorch Implementation
MIT License

I have met this error when run train.py ... #5

Open bemoregt opened 5 years ago

bemoregt commented 5 years ago

Hi, @owang @sridharmahadevan @akanimax @huangzh13

I hit this error when running train.py. What am I doing wrong?

oem@sgi:~/BMSG-GAN/sourcecode$ python3 train.py --depth=7 --latent_size=128 --images_dir='../data/celebJapan/train' --sample_dir=samples/exp_2 --model_dir=models/exp_2
Total number of images in the dataset: 6604

error message - Starting the training process ...

Epoch: 1
Elapsed [0:00:04.581270] batch: 1 d_loss: 4.346926 g_loss: 6.674685
Traceback (most recent call last):
  File "train.py", line 254, in <module>
    main(parse_arguments())
  File "train.py", line 248, in main
    start=args.start
  File "/home/oem/BMSG-GAN/sourcecode/MSG_GAN/GAN.py", line 482, in train
    gen_img_files)
  File "/home/oem/BMSG-GAN/sourcecode/MSG_GAN/GAN.py", line 345, in create_grid
    samples = [Generator.adjust_dynamic_range(sample) for sample in samples]
  File "/home/oem/BMSG-GAN/sourcecode/MSG_GAN/GAN.py", line 345, in <listcomp>
    samples = [Generator.adjust_dynamic_range(sample) for sample in samples]
  File "/home/oem/BMSG-GAN/sourcecode/MSG_GAN/GAN.py", line 96, in adjust_dynamic_range
    data = data * scale + bias
TypeError: mul() received an invalid combination of arguments - got (numpy.float32), but expected one of:

Thanks in advance ~

akanimax commented 5 years ago

Could you please tell me which versions of python, torch and numpy you are using? Please try updating torch and numpy to their latest versions. The code is tested with python == 3.6.5. Please let me know if you still face this issue.

bemoregt commented 5 years ago

Hi, @owang @sridharmahadevan @akanimax @huangzh13

My Environment:

Ubuntu 17.x x64, Python 3.6.7, CUDA 10.1, Pytorch 0.4.1, numpy 1.15.4

Thanks.

akanimax commented 5 years ago

Could you please try again with python 3.6? The error occurs right after the first training log line.

bemoregt commented 5 years ago

Hi, @owang @sridharmahadevan @akanimax @huangzh13

It's the same with python 3.6 ...

What am I doing wrong?

Thanks anyway.

akanimax commented 5 years ago

Could you try updating pytorch to 1.0.0? I hope this solves the problem.
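
A note for readers who cannot upgrade: the traceback above shows the failure at `data = data * scale + bias`, where `scale` was a `numpy.float32`. One possible workaround, sketched below, is to compute the scale and bias as plain Python floats before the multiply. The signature and default ranges here are assumptions for illustration, not the repository's actual code; only the final multiply-add line is taken from the traceback.

```python
def adjust_dynamic_range(data, drange_in=(-1.0, 1.0), drange_out=(0.0, 1.0)):
    """Linearly map values from drange_in to drange_out.

    Hypothetical sketch: the parameter names and defaults are assumptions.
    `data` can be any object that supports `*` and `+` with a Python float,
    for example a torch tensor or a plain number.
    """
    lo_in, hi_in = float(drange_in[0]), float(drange_in[1])
    lo_out, hi_out = float(drange_out[0]), float(drange_out[1])
    if (lo_in, hi_in) != (lo_out, hi_out):
        # float() strips any numpy scalar type, so the tensor only ever
        # sees built-in floats in the arithmetic below.
        scale = (hi_out - lo_out) / (hi_in - lo_in)
        bias = lo_out - lo_in * scale
        data = data * scale + bias
    return data
```

With the assumed defaults, an input of -1.0 maps to 0.0 and an input of 1.0 maps to 1.0. Upgrading PyTorch, as suggested above, is the fix that was confirmed in this thread.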

bemoregt commented 5 years ago

OK, I'll try that...

bemoregt commented 5 years ago

It works. Thanks a lot.

from @bemoregt

akanimax commented 5 years ago

@bemoregt,

I am glad that it is working now. Just wanted to point out that since you are synthesizing Japanese celebs at 256 x 256 resolution, the latent_size = 128 might not be enough to make the generator expressive enough. Please try to use latent_size=512.

Also, if you are able to get good results, please feel free to share these with us, I'll be happy to include them on the readme like @huangzh13's cartoons :smile:.

Hope this helps.

:+1: Best regards, @akanimax

bemoregt commented 5 years ago

But, ...

Elapsed [0:04:07.511359] batch: 108 d_loss: 0.040370 g_loss: 18.472263
Elapsed [0:04:15.999767] batch: 112 d_loss: 0.000000 g_loss: 12.169998
Elapsed [0:04:24.425038] batch: 116 d_loss: 0.053961 g_loss: 16.491339
Elapsed [0:04:32.862795] batch: 120 d_loss: 0.000000 g_loss: 11.238050
Traceback (most recent call last):
  File "train.py", line 254, in <module>
    main(parse_arguments())
  File "train.py", line 248, in main
    start=args.start
  File "/home/oem/BMSG-GAN/sourcecode/MSG_GAN/GAN.py", line 417, in train
    for (i, batch) in enumerate(data, 1):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 637, in __next__
    return self._process_next_batch(batch)
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
RuntimeError: Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 209, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 3 and 1 in dimension 1 at /pytorch/aten/src/TH/generic/THTensorMoreMath.cpp:1307

Another error occurred.

akanimax commented 5 years ago

@bemoregt,

I see. The code does not handle grayscale images. I'll fix this by tomorrow when I get access to my code (I am currently travelling). For now, could you please remove all the grayscale (black and white) images from your dataset?

Thanks. @akanimax
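
The "Got 3 and 1 in dimension 1" message is consistent with a batch that mixes 3-channel and 1-channel images, which cannot be stacked into one tensor. A minimal sketch for locating such files before training, assuming Pillow is installed and the images sit under a single directory; the function name and the suffix list are hypothetical and not part of the repository:

```python
from pathlib import Path

from PIL import Image

# Assumed set of image extensions; extend it to match your dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def find_non_rgb_images(images_dir):
    """Return sorted paths of images whose Pillow mode is not 'RGB'.

    Grayscale files typically report mode 'L'. Other modes such as
    'RGBA' or 'P' are listed too, since they also differ from plain
    3-channel RGB.
    """
    offenders = []
    for path in sorted(Path(images_dir).rglob("*")):
        if path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        with Image.open(path) as img:
            if img.mode != "RGB":
                offenders.append(path)
    return offenders
```

Calling `find_non_rgb_images('../data/celebJapan/train')` would return the files to move out of the training directory. Review the list before deleting anything.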

bemoregt commented 5 years ago

Hi, @akanimax

OK, I see.

I understand the problem with my data now.

My images include some rotated and zero-padded images.

Those images may be what causes the error.

Many Thanks ~

bemoregt commented 5 years ago

@akanimax

Here is the current sample at epoch=227:

https://3.bp.blogspot.com/-KY44bqw_nd8/XMD7JNG-6SI/AAAAAAABAkY/lI9VEv8nhWw4xbMFh4RI8tb8nhkjZuImACLcBGAs/s1600/epoch227.png

bemoregt commented 5 years ago

Hi, @akanimax

celebJapan, epoch=230.., TitanXP + 1080ti

[image: epoch227.png]

Thanks ..

from @bemoregt.

On Tue, Apr 23, 2019 at 6:06 PM, Animesh Karnewar notifications@github.com wrote:

Could you try updating pytorch to 1.0.0? I hope this solves the problem.

huangzh13 commented 5 years ago

Hi, @bemoregt Could you tell me something about your celebJapan dataset?

Best regards.

bemoregt commented 5 years ago

Hi, @huangzh13 @akanimax

OK, here is the information about my celebJapan dataset:

Is this dataset too small for MSG-GAN?

Thanks.

akanimax commented 5 years ago

@bemoregt, The results seem good to me given the size of your dataset. BTW, could you share a full-size sheet of the generated images? The one you shared seems to be a screenshot of the image viewer. I think you should let it train for longer. One more thing you could try is to calculate the FID of the models for an objective evaluation. The dataset size is OK for this resolution. Also try increasing the latent size. Hope this helps.

Best regards, @akanimax
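
For reference, the FID (Fréchet Inception Distance) mentioned above compares the statistics of Inception-network features for real and generated images. In the standard definition, $\mu_r, \Sigma_r$ are the mean and covariance of the features of the real images and $\mu_g, \Sigma_g$ those of the generated images; these symbols are introduced here for illustration and do not appear in the thread:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the generated feature distribution is closer to the real one, so tracking FID across checkpoints gives a number to compare rather than relying on visual inspection alone.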

bemoregt commented 5 years ago

Hi, @akanimax @huangzh13 @owang @sridharmahadevan

It seems that MSG-GAN is weak at generating rotated faces.

Which image augmentation techniques are suitable for a face-generating GAN?

Thanks .

from @bemoregt

Pascal900 commented 5 years ago

> @bemoregt,
>
> I see. There is no handling of Grayscale image case. I'll fix this by tomorrow when I get access to my code (I am currently travelling). For now, could you please remove all the grayscale (black and white) images from your dataset?
>
> Thanks. @akanimax

Hi, @akanimax

I'd be happy to test MSG-GAN on radiology data.

Is there a way to support grayscale output images in your next update?

Thanks!

akanimax commented 5 years ago

@Pascal900,

Great to hear that you would like to use MSG-GAN for radiology data. Earlier, when I said that I'd handle the grayscale case, I meant just ignoring the grayscale images in the dataset. But in your case, it seems that all the images in the dataset would be grayscale. I will create a new branch for this development, since it is a new addition to the network. Until then, one thing you could try is to make RGB images from your grayscale ones. The network will just learn to output the same values for the R, G and B channels. I have tried this before on MNIST data, and it worked pretty well.

Please feel free to ask if you have any more queries.

Best regards, @akanimax
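
The grayscale-to-RGB suggestion above amounts to copying the single channel three times. A minimal sketch, assuming images are loaded as NumPy arrays in height x width (or height x width x 1) layout; the function name is hypothetical and this is not code from the repository:

```python
import numpy as np


def gray_to_rgb(image):
    """Replicate a single-channel image into three identical channels.

    Accepts arrays shaped (H, W) or (H, W, 1) and returns (H, W, 3).
    Arrays that already have three channels are returned unchanged.
    """
    image = np.asarray(image)
    if image.ndim == 2:
        # Add an explicit channel axis: (H, W) -> (H, W, 1).
        image = image[..., np.newaxis]
    if image.ndim != 3 or image.shape[-1] not in (1, 3):
        raise ValueError("expected shape (H, W), (H, W, 1) or (H, W, 3)")
    if image.shape[-1] == 3:
        return image
    # Copy the single channel into R, G and B.
    return np.repeat(image, 3, axis=-1)
```

The converted arrays can then be saved as ordinary RGB files so that the existing 3-channel data pipeline runs without changes.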

mdraw commented 5 years ago

Since I am also working on grayscale radiology data and needed support for that immediately, I've implemented this in #14. @Pascal900, maybe you can try my branch if this use case is still relevant to you. I'd be happy to hear feedback.