aayushbansal / Recycle-GAN

Unsupervised Video Retargeting (e.g. face to face, flower to flower, clouds and winds, sunrise and sunset)
MIT License

immediate out of memory error when running train.py #6

Closed wanshun123 closed 5 years ago

wanshun123 commented 5 years ago

I've downloaded the faces dataset at https://www.dropbox.com/s/s6kzovbrevin5tr/faces.tar.gz?dl=0 and am running train.py as follows (in my ObamaTrump directory, extracted from faces.tar.gz, there are trainA and trainB folders with each file there containing 3 horizontally concatenated images):

python train.py --dataroot /home/paperspace/rgan/rgan/dataset/faces/ObamaTrump --dataset_mode unaligned_triplet --model recycle_gan

This fails immediately with the error "cuda runtime error (2) : out of memory". My GPU is a Quadro 4000 with 8 GB of memory.

The full output, including the stack trace, is as follows:

------------ Options -------------
batchSize: 1
beta1: 0.5
checkpoints_dir: ./checkpoints
continue_train: False
dataroot: /home/paperspace/rgan/rgan/dataset/faces/ObamaTrump
dataset_mode: unaligned_triplet
display_freq: 100
display_id: 1
display_port: 8097
display_single_pane_ncols: 0
display_winsize: 256
epoch_count: 1
fineSize: 256
gpu_ids: [0]
identity: 0.5
init_type: normal
input_nc: 3
isTrain: True
lambda_A: 10.0
lambda_B: 10.0
loadSize: 286
lr: 0.0002
lr_decay_iters: 50
lr_policy: lambda
max_dataset_size: inf
model: recycle_gan
nThreads: 2
n_layers_D: 3
name: experiment_name
ndf: 64
ngf: 64
niter: 100
niter_decay: 100
no_dropout: False
no_flip: False
no_html: False
no_lsgan: False
norm: instance
npf: 8
output_nc: 3
phase: train
pool_size: 1000
print_freq: 100
resize_or_crop: resize_and_crop
save_epoch_freq: 1
save_latest_freq: 2000
serial_batches: False
update_html_freq: 1000
which_direction: AtoB
which_epoch: latest
which_model_netD: basic
which_model_netG: resnet_9blocks
which_model_netP: prediction
-------------- End ----------------
CustomDatasetDataLoader
dataset [UnalignedTripletDataset] was created
#training images = 4721
recycle_gan
initialization method [normal]
/home/paperspace/rgan/rgan/models/networks.py:17: UserWarning: nn.init.normal is now deprecated in favor of nn.init.normal_.
  init.normal(m.weight.data, 0.0, 0.02)
initialization method [normal]
initialization method [normal]
initialization method [normal]
initialization method [normal]
initialization method [normal]
---------- Networks initialized -------------
ResnetGenerator(
  (model): Sequential(
    (0): ReflectionPad2d((3, 3, 3, 3))
    (1): Conv2d(3, 64, kernel_size=(7, 7), stride=(1, 1))
    (2): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (3): ReLU(inplace)
    (4): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    (5): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (6): ReLU(inplace)
    (7): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    (8): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (9): ReLU(inplace)
    (10): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (11): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (12): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (13): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (14): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (15): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (16): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (17): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (18): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (19): ConvTranspose2d(256, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
    (20): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (21): ReLU(inplace)
    (22): ConvTranspose2d(128, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
    (23): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (24): ReLU(inplace)
    (25): ReflectionPad2d((3, 3, 3, 3))
    (26): Conv2d(64, 3, kernel_size=(7, 7), stride=(1, 1))
    (27): Tanh()
  )
)
Total number of parameters: 11378179
ResnetGenerator(
  (model): Sequential(
    (0): ReflectionPad2d((3, 3, 3, 3))
    (1): Conv2d(3, 64, kernel_size=(7, 7), stride=(1, 1))
    (2): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (3): ReLU(inplace)
    (4): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    (5): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (6): ReLU(inplace)
    (7): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
    (8): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (9): ReLU(inplace)
    (10): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (11): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (12): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (13): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (14): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (15): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (16): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (17): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (18): ResnetBlock(
      (conv_block): Sequential(
        (0): ReflectionPad2d((1, 1, 1, 1))
        (1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
        (3): ReLU(inplace)
        (4): Dropout(p=0.5)
        (5): ReflectionPad2d((1, 1, 1, 1))
        (6): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1))
        (7): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
      )
    )
    (19): ConvTranspose2d(256, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
    (20): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (21): ReLU(inplace)
    (22): ConvTranspose2d(128, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
    (23): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (24): ReLU(inplace)
    (25): ReflectionPad2d((3, 3, 3, 3))
    (26): Conv2d(64, 3, kernel_size=(7, 7), stride=(1, 1))
    (27): Tanh()
  )
)
Total number of parameters: 11378179
PredictionNViews(
  (model): UnetGenerator(
    (model): UnetSkipConnectionBlock(
      (model): Sequential(
        (0): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
        (1): UnetSkipConnectionBlock(
          (model): Sequential(
            (0): LeakyReLU(negative_slope=0.2, inplace)
            (1): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
            (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
            (3): UnetSkipConnectionBlock(
              (model): Sequential(
                (0): LeakyReLU(negative_slope=0.2, inplace)
                (1): Conv2d(256, 512, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
                (2): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
                (3): UnetSkipConnectionBlock(
                  (model): Sequential(
                    (0): LeakyReLU(negative_slope=0.2, inplace)
                    (1): Conv2d(512, 1024, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
                    (2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
                    (3): UnetSkipConnectionBlock(
                      (model): Sequential(
                        (0): LeakyReLU(negative_slope=0.2, inplace)
                        (1): Conv2d(1024, 1024, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
                        (2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
                        (3): UnetSkipConnectionBlock(
                          (model): Sequential(
                            (0): LeakyReLU(negative_slope=0.2, inplace)
                            (1): Conv2d(1024, 1024, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
                            (2): ReLU(inplace)
                            (3): ConvTranspose2d(1024, 1024, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
                            (4): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
                          )
                        )
                        (4): ReLU(inplace)
                        (5): ConvTranspose2d(2048, 1024, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
                        (6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
                        (7): Dropout(p=0.5)
                      )
                    )
                    (4): ReLU(inplace)
                    (5): ConvTranspose2d(2048, 512, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
                    (6): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
                  )
                )
                (4): ReLU(inplace)
                (5): ConvTranspose2d(1024, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
                (6): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
              )
            )
            (4): ReLU(inplace)
            (5): ConvTranspose2d(512, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
            (6): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
          )
        )
        (2): ReLU(inplace)
        (3): ConvTranspose2d(256, 32, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
        (4): Tanh()
      )
    )
  )
  (model1): Sequential(
    (0): ReflectionPad2d((3, 3, 3, 3))
    (1): Conv2d(3, 8, kernel_size=(7, 7), stride=(1, 1))
    (2): InstanceNorm2d(8, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (3): ReLU(inplace)
    (4): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (5): InstanceNorm2d(16, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (6): ReLU(inplace)
    (7): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (9): ReLU(inplace)
  )
  (model3): Sequential(
    (0): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): InstanceNorm2d(16, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (2): ReLU(inplace)
    (3): Conv2d(16, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (4): InstanceNorm2d(8, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (5): ReLU(inplace)
    (6): ReflectionPad2d((3, 3, 3, 3))
    (7): Conv2d(8, 3, kernel_size=(7, 7), stride=(1, 1))
    (8): Tanh()
  )
)
Total number of parameters: 117199267
PredictionNViews(
  (model): UnetGenerator(
    (model): UnetSkipConnectionBlock(
      (model): Sequential(
        (0): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
        (1): UnetSkipConnectionBlock(
          (model): Sequential(
            (0): LeakyReLU(negative_slope=0.2, inplace)
            (1): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
            (2): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
            (3): UnetSkipConnectionBlock(
              (model): Sequential(
                (0): LeakyReLU(negative_slope=0.2, inplace)
                (1): Conv2d(256, 512, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
                (2): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
                (3): UnetSkipConnectionBlock(
                  (model): Sequential(
                    (0): LeakyReLU(negative_slope=0.2, inplace)
                    (1): Conv2d(512, 1024, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
                    (2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
                    (3): UnetSkipConnectionBlock(
                      (model): Sequential(
                        (0): LeakyReLU(negative_slope=0.2, inplace)
                        (1): Conv2d(1024, 1024, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
                        (2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
                        (3): UnetSkipConnectionBlock(
                          (model): Sequential(
                            (0): LeakyReLU(negative_slope=0.2, inplace)
                            (1): Conv2d(1024, 1024, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
                            (2): ReLU(inplace)
                            (3): ConvTranspose2d(1024, 1024, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
                            (4): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
                          )
                        )
                        (4): ReLU(inplace)
                        (5): ConvTranspose2d(2048, 1024, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
                        (6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
                        (7): Dropout(p=0.5)
                      )
                    )
                    (4): ReLU(inplace)
                    (5): ConvTranspose2d(2048, 512, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
                    (6): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
                  )
                )
                (4): ReLU(inplace)
                (5): ConvTranspose2d(1024, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
                (6): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
              )
            )
            (4): ReLU(inplace)
            (5): ConvTranspose2d(512, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
            (6): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
          )
        )
        (2): ReLU(inplace)
        (3): ConvTranspose2d(256, 32, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
        (4): Tanh()
      )
    )
  )
  (model1): Sequential(
    (0): ReflectionPad2d((3, 3, 3, 3))
    (1): Conv2d(3, 8, kernel_size=(7, 7), stride=(1, 1))
    (2): InstanceNorm2d(8, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (3): ReLU(inplace)
    (4): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (5): InstanceNorm2d(16, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (6): ReLU(inplace)
    (7): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (8): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (9): ReLU(inplace)
  )
  (model3): Sequential(
    (0): Conv2d(32, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): InstanceNorm2d(16, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (2): ReLU(inplace)
    (3): Conv2d(16, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (4): InstanceNorm2d(8, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (5): ReLU(inplace)
    (6): ReflectionPad2d((3, 3, 3, 3))
    (7): Conv2d(8, 3, kernel_size=(7, 7), stride=(1, 1))
    (8): Tanh()
  )
)
Total number of parameters: 117199267
NLayerDiscriminator(
  (model): Sequential(
    (0): Conv2d(3, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
    (1): LeakyReLU(negative_slope=0.2, inplace)
    (2): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
    (3): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (4): LeakyReLU(negative_slope=0.2, inplace)
    (5): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
    (6): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (7): LeakyReLU(negative_slope=0.2, inplace)
    (8): Conv2d(256, 512, kernel_size=(4, 4), stride=(1, 1), padding=(1, 1))
    (9): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (10): LeakyReLU(negative_slope=0.2, inplace)
    (11): Conv2d(512, 1, kernel_size=(4, 4), stride=(1, 1), padding=(1, 1))
  )
)
Total number of parameters: 2764737
NLayerDiscriminator(
  (model): Sequential(
    (0): Conv2d(3, 64, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
    (1): LeakyReLU(negative_slope=0.2, inplace)
    (2): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
    (3): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (4): LeakyReLU(negative_slope=0.2, inplace)
    (5): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(1, 1))
    (6): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (7): LeakyReLU(negative_slope=0.2, inplace)
    (8): Conv2d(256, 512, kernel_size=(4, 4), stride=(1, 1), padding=(1, 1))
    (9): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
    (10): LeakyReLU(negative_slope=0.2, inplace)
    (11): Conv2d(512, 1, kernel_size=(4, 4), stride=(1, 1), padding=(1, 1))
  )
)
Total number of parameters: 2764737
-----------------------------------------------
model [RecycleGANModel] was created
create web directory ./checkpoints/experiment_name/web...
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1524590031827/work/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
  File "train.py", line 27, in <module>
    model.optimize_parameters()
  File "/home/paperspace/rgan/rgan/models/recycle_gan_model.py", line 333, in optimize_parameters
    self.backward_G()
  File "/home/paperspace/rgan/rgan/models/recycle_gan_model.py", line 283, in backward_G
    pred_A2 = self.netP_A(self.real_A0, self.real_A1)
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/paperspace/rgan/rgan/models/networks.py", line 359, in forward
    f1 = self.model1(input1)
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 491, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/torch/nn/modules/instancenorm.py", line 50, in forward
    self.training or not self.track_running_stats, self.momentum, self.eps)
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py", line 1245, in instance_norm
    eps=eps)
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/torch/onnx/__init__.py", line 57, in wrapper
    return fn(*args, **kwargs)
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py", line 1233, in _instance_norm
    training=use_input_stats, momentum=momentum, eps=eps)
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py", line 1194, in batch_norm
    training, momentum, eps, torch.backends.cudnn.enabled
RuntimeError: cuda runtime error (2) : out of memory at /opt/conda/conda-bld/pytorch_1524590031827/work/aten/src/THC/generic/THCStorage.cu:58
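
For a rough sense of scale, the parameter counts printed in the log can be turned into a memory estimate. This is a back-of-the-envelope sketch, not a measurement: it assumes float32 weights and an Adam-style optimizer that keeps two extra buffers per parameter (the beta1 option hints at this, but the log does not confirm it), and it ignores activations entirely.

```python
# Parameter counts copied from the "Total number of parameters" lines in the log.
PARAM_COUNTS = {
    "generator": 11378179,
    "predictor": 117199267,
    "discriminator": 2764737,
}
COPIES_PER_NETWORK = 2  # the log prints each network twice
BYTES_PER_FLOAT32 = 4

total_params = COPIES_PER_NETWORK * sum(PARAM_COUNTS.values())
weight_bytes = total_params * BYTES_PER_FLOAT32

# Assumption: weights + gradients + two Adam moment buffers = 4 tensors per parameter.
TENSORS_PER_PARAM = 4
training_state_bytes = weight_bytes * TENSORS_PER_PARAM

GIB = 1024 ** 3
print(f"parameters:     {total_params:,}")
print(f"weights only:   {weight_bytes / GIB:.2f} GiB")
print(f"training state: {training_state_bytes / GIB:.2f} GiB")
```

Under these assumptions the weights, gradients, and optimizer buffers alone come to a little under 4 GiB, most of it from the two prediction networks, so a large share of an 8 GB card would be committed before any activation is stored.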

wanshun123 commented 5 years ago

With my graphics card, I was able to run training with 128x128 images.
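
The comment above does not say how the 128x128 run was configured. The options dump in the first post lists loadSize: 286 and fineSize: 256, so one plausible route is lowering both. In the sketch below, the flag spellings --loadSize and --fineSize are an assumption based on those option names, and build_train_command is a hypothetical helper, not part of the repository. The sketch also shows why halving the side length helps: the size of a feature map scales with height times width.

```python
def feature_map_bytes(channels, height, width, batch_size=1, bytes_per_value=4):
    """Size in bytes of one float32 activation tensor of shape (N, C, H, W)."""
    return batch_size * channels * height * width * bytes_per_value


def build_train_command(dataroot, fine_size, load_ratio=286 / 256):
    """Hypothetical helper: argv list for train.py at a smaller crop size.

    load_ratio keeps the default loadSize/fineSize proportion (286/256).
    The --loadSize and --fineSize spellings are assumed from the options dump.
    """
    load_size = round(fine_size * load_ratio)
    return [
        "python", "train.py",
        "--dataroot", dataroot,
        "--dataset_mode", "unaligned_triplet",
        "--model", "recycle_gan",
        "--loadSize", str(load_size),
        "--fineSize", str(fine_size),
    ]


# A 64-channel map, as produced by the generator's first convolution in the log.
full = feature_map_bytes(64, 256, 256)
half = feature_map_bytes(64, 128, 128)
print(full // half)  # memory ratio between the two resolutions

# Placeholder path; substitute the real dataset directory.
print(" ".join(build_train_command("/path/to/ObamaTrump", fine_size=128)))
```

Every full-resolution activation shrinks by the same factor, which is consistent with the smaller images fitting on the same card, although the exact savings depend on the whole network.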

claudetheboof commented 5 years ago

Maybe consider running this on Google Colab instead. Colab provides a free Tesla T4 for machine learning, and I am already using a heavily modified version of Recycle-GAN in my Colab notebook.

claudetheboof commented 5 years ago

256x256 with a batch size of 6 is the most I can fit on Google Colab without having to use system RAM. You could push it to a batch size of 8, but it will train slightly slower; the result is worth it, though. In general, whenever I train a GAN, the larger the batch size you can squeeze in, the better.
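
The largest batch size that fits depends on the GPU and the image size, so it has to be found empirically. Below is a minimal sketch of one way to probe it. It assumes out-of-memory failures surface as a RuntimeError whose message contains "out of memory" (as in the traceback in the first post) and that memory use grows with batch size. try_step is a hypothetical callable that runs one training iteration at the given batch size; fake_step only stands in for it here.

```python
def find_max_batch_size(try_step, limit=64):
    """Return the largest batch size in 1..limit for which try_step succeeds.

    Returns 0 if even a batch size of 1 runs out of memory.
    """
    best = 0
    for batch_size in range(1, limit + 1):
        try:
            try_step(batch_size)
        except RuntimeError as err:
            # Only treat out-of-memory failures as "too big"; re-raise anything else.
            if "out of memory" not in str(err):
                raise
            break
        best = batch_size
    return best


def fake_step(batch_size, pretend_limit=6):
    """Stand-in for a real training step. pretend_limit is not a measurement."""
    if batch_size > pretend_limit:
        raise RuntimeError("cuda runtime error (2) : out of memory")


print(find_max_batch_size(fake_step))
```

In a real run, try_step would build a batch of the requested size and call the model's optimization step once; the probe stops at the first failure rather than searching further.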

ak9250 commented 5 years ago

@claudetheboof do you have to retrain the model for each pair of people (like Trump and Obama), or can one model be used for any face? How long does training take? Could you also share your Colab notebook?