Thank you for sharing the implementation of the DDGAN model. I am trying to train the model on the FFHQ 256x256 dataset. I used the NVLabs/NVAE repository for the dataset preparation. I have the file structure as follows:
To use another dataset similar to CelebA-HQ 256x256, I modified the train function at line 190 of the train_ddgan.py file.
My DDGAN setup uses 4 NVIDIA GTX 1080 Ti GPUs with a total batch size of 32 for training on the CelebA-HQ 256x256 dataset
(--batch_size 8 and --num_process_per_node 4).
I use the following command for training:
!python3 train_ddgan.py --dataset ffhq_256 --image_size 256 --exp ddgan_celebahq_exp1 --num_channels 3 --num_channels_dae 64 --ch_mult 1 1 2 2 4 4 --num_timesteps 2 --num_res_blocks 2 --batch_size 8 --num_epoch 800 --ngf 64 --embedding_type positional --use_ema --r1_gamma 2. --z_emb_dim 256 --lr_d 1e-4 --lr_g 2e-4 --lazy_reg 10 --num_process_per_node 4 --save_content
I am getting the following output message:
Node rank 0, local proc 0, global proc 0
Node rank 0, local proc 1, global proc 1
Node rank 0, local proc 2, global proc 2
Node rank 0, local proc 3, global proc 3
Process Process-4:
Traceback (most recent call last):
File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "train_ddgan.py", line 482, in init_processes
fn(rank, gpu, args)
File "train_ddgan.py", line 248, in train
dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)
File "/home/manisha.padala/gan/denoising-diffusion-gan/datasets_prep/lmdb_datasets.py", line 33, in __init__
self.data_lmdb = lmdb.open(lmdb_path, readonly=True, max_readers=1,
lmdb.Error: /datasets/ffhq-lmdb/train.lmdb: No such file or directory
Process Process-2:
Traceback (most recent call last):
File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "train_ddgan.py", line 482, in init_processes
fn(rank, gpu, args)
File "train_ddgan.py", line 248, in train
dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)
File "/home/manisha.padala/gan/denoising-diffusion-gan/datasets_prep/lmdb_datasets.py", line 33, in __init__
self.data_lmdb = lmdb.open(lmdb_path, readonly=True, max_readers=1,
lmdb.Error: /datasets/ffhq-lmdb/train.lmdb: No such file or directory
Process Process-1:
Traceback (most recent call last):
File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "train_ddgan.py", line 482, in init_processes
fn(rank, gpu, args)
File "train_ddgan.py", line 248, in train
dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)
File "/home/manisha.padala/gan/denoising-diffusion-gan/datasets_prep/lmdb_datasets.py", line 33, in __init__
self.data_lmdb = lmdb.open(lmdb_path, readonly=True, max_readers=1,
lmdb.Error: /datasets/ffhq-lmdb/train.lmdb: No such file or directory
Process Process-3:
Traceback (most recent call last):
File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/usr/local/apps/python-3.8.3/lib/python3.8/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "train_ddgan.py", line 482, in init_processes
fn(rank, gpu, args)
File "train_ddgan.py", line 248, in train
dataset = LMDBDataset(root='/datasets/ffhq-lmdb/', name='ffhq', train=True, transform=train_transform)
File "/home/manisha.padala/gan/denoising-diffusion-gan/datasets_prep/lmdb_datasets.py", line 33, in __init__
self.data_lmdb = lmdb.open(lmdb_path, readonly=True, max_readers=1,
lmdb.Error: /datasets/ffhq-lmdb/train.lmdb: No such file or directory
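All four processes fail on the same path, so the loader appears to look for train.lmdb directly under the root I pass in. Below is a minimal sketch for checking that path before launching the training processes. The helper names expected_lmdb_path and check_lmdb_root are hypothetical and not part of the repository. The 'validation.lmdb' name for the non-training split is an assumption; only 'train.lmdb' is confirmed by the error message.

```python
import os


def expected_lmdb_path(root, train=True):
    """Return the LMDB path the loader is assumed to open for a given root.

    'train.lmdb' matches the path in the error message above;
    'validation.lmdb' is an assumption for the non-training split.
    """
    split = 'train.lmdb' if train else 'validation.lmdb'
    return os.path.join(root, split)


def check_lmdb_root(root, train=True):
    """Return (path, exists) so a missing LMDB is caught before training starts."""
    path = expected_lmdb_path(root, train)
    return path, os.path.exists(path)


if __name__ == '__main__':
    # Root taken from the traceback; replace it with the actual dataset location.
    path, found = check_lmdb_root('/datasets/ffhq-lmdb/')
    print(f"{path}: {'found' if found else 'missing'}")
```

If this reports the path as missing, either the root passed to LMDBDataset in the train function or the output location of the NVAE preparation step likely needs to change so the two agree.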