Open akamob opened 2 years ago
Hello,
`--preprocess none` will disable cropping. Cropping can be beneficial when you don't have enough samples in your dataset, to prevent overfitting. If the goal is to pass the output images to pose estimation networks, perhaps you can increase the weight on the L1 loss. It will generate less diverse images, but they will be more faithful to the input data and have fewer artifacts.
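For context, the weighted objective this reply refers to can be sketched as follows. This is a simplified, framework-free sketch (plain Python instead of the repo's PyTorch code) of how `--lambda_L1` enters the pix2pix generator loss:

```python
def l1(pred, target):
    """Mean absolute error over flat lists of pixel values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def pix2pix_g_loss(fake, real, gan_loss, lambda_L1=100.0):
    """pix2pix-style generator objective, sketched: the adversarial term plus
    lambda_L1 times the L1 distance to the paired ground truth.
    Raising lambda_L1 trades output diversity for fidelity to the pairs."""
    return gan_loss + lambda_L1 * l1(fake, real)
```

With the default `lambda_L1=100`, the L1 term dominates the adversarial term, which is why increasing it further pushes the generator toward faithful (if less diverse) reconstructions.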
Hi, @taesungp, thank you very much for your reply:)
Yes, I have paired data. But my depth images are distorted, so they may not correspond accurately to the same locations in the RGB images. This is why I use CycleGAN. Given my task, is pix2pix better than CycleGAN?
In pix2pix_model.py, I found:
parser.add_argument('--lambda_L1', type=float, default=100.0, help='weight for L1 loss')
And CycleGAN (cycle_gan_model.py) has:
lambda_A, lambda_B, default=10.0
lambda_identity, default=0.5
But I don't know how much to increase these weights; this issue increased the lambda weight from 10 to 20. Is there any rule to follow? In other words, I'm wondering what a reasonable range for these loss weights is.
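For reference, here is a simplified sketch (plain Python, GAN terms omitted) of how `lambda_A`, `lambda_B`, and `lambda_identity` combine in the CycleGAN generator loss, following the structure of `cycle_gan_model.py`:

```python
def l1(pred, target):
    """Mean absolute error over flat lists of pixel values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def cyclegan_g_loss(real_A, rec_A, real_B, rec_B, idt_A, idt_B,
                    lambda_A=10.0, lambda_B=10.0, lambda_identity=0.5):
    """Cycle-consistency losses scaled by lambda_A / lambda_B, plus identity
    losses scaled by lambda_identity on top of the cycle weights."""
    loss = l1(rec_A, real_A) * lambda_A      # A -> B -> A reconstruction
    loss += l1(rec_B, real_B) * lambda_B     # B -> A -> B reconstruction
    loss += l1(idt_A, real_B) * lambda_B * lambda_identity
    loss += l1(idt_B, real_A) * lambda_A * lambda_identity
    return loss
```

Because the identity terms are multiplied by the cycle weights, raising `lambda_A`/`lambda_B` also strengthens the identity constraint unless `lambda_identity` is lowered to compensate.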
In addition, I can only train for 100 epochs at a time on Colab, so I use --continue_train
for the next run. Compared to training in one go, I'm not sure whether my training configuration is proper (in particular, the learning rate decay).
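For what it's worth, the repo's default `linear` lr policy keeps the learning rate constant for the first `n_epochs` and then decays it linearly over `n_epochs_decay`; when resuming with `--continue_train`, `--epoch_count` tells the scheduler where on this curve to restart. A simplified sketch of the schedule (not the repo's exact scheduler code):

```python
def linear_lr(epoch, n_epochs=100, n_epochs_decay=100, base_lr=0.0002):
    """Learning rate at a given (1-based) epoch under a linear-decay policy:
    constant at base_lr for the first n_epochs, then decaying linearly
    toward zero over the following n_epochs_decay epochs."""
    factor = 1.0 - max(0, epoch - n_epochs) / float(n_epochs_decay + 1)
    return base_lr * factor
```

Under this schedule, resuming with `--epoch_count 180` (as in the command above) restarts decay at roughly 21% of the base lr, so split Colab sessions should roughly match one long run as long as `--epoch_count` is set to the correct next epoch each time.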
May I have your suggestions? Any help is much appreciated!
Hi @taesungp, the following are my recent experiment results. My dataset has 9411 depth images and 9081 RGB images; all images are 256x256 and cover three actions: forward/backward, waving hands, and forward bend.
I use this command to train CycleGAN on Colab: !python train.py --dataroot ./datasets/yicyclepix_0322 --name yivlp2rgbhuman --model cycle_gan --n_epochs 100 --n_epochs_decay 100 --epoch_count 180 --continue_train --lambda_A 25 --lambda_B 25 --batch_size 3 --preprocess crop --load_size 256 --crop_size 224 --display_id -1
Training options:
----------------- Options ---------------
batch_size: 3 [default: 1]
beta1: 0.5
checkpoints_dir: ./checkpoints
continue_train: True [default: False]
crop_size: 224 [default: 256]
dataroot: ./datasets/yicyclepix_0322 [default: None]
dataset_mode: unaligned
direction: AtoB
display_env: main
display_freq: 400
display_id: -1 [default: 1]
display_ncols: 4
display_port: 8097
display_server: http://localhost
display_winsize: 256
epoch: latest
epoch_count: 180 [default: 1]
gan_mode: lsgan
gpu_ids: 0
init_gain: 0.02
init_type: normal
input_nc: 3
isTrain: True [default: None]
lambda_A: 25.0 [default: 10.0]
lambda_B: 25.0 [default: 10.0]
lambda_identity: 0.5
load_iter: 0 [default: 0]
load_size: 256 [default: 286]
lr: 0.0002
lr_decay_iters: 50
lr_policy: linear
max_dataset_size: inf
model: cycle_gan
n_epochs: 100
n_epochs_decay: 100
n_layers_D: 3
name: yivlp2rgbhuman [default: experiment_name]
ndf: 64
netD: basic
netG: resnet_9blocks
ngf: 64
no_dropout: True
no_flip: False
no_html: False
norm: instance
num_threads: 4
output_nc: 3
phase: train
pool_size: 50
preprocess: crop [default: resize_and_crop]
print_freq: 100
save_by_iter: False
save_epoch_freq: 5
save_latest_freq: 5000
serial_batches: False
suffix:
update_html_freq: 1000
use_wandb: False
verbose: False
----------------- End -------------------
After 200 epochs of training, I plotted the losses. These videos are the test results. For forward/backward: Input, Output. For waving hands: Input, Output. For forward bend: Input, Output.
I add these flags when testing my CycleGAN model: --batch_size 3 --preprocess crop --load_size 256 --crop_size 224 --no_dropout. As these videos show, the results are not good. Could you give me some suggestions?
Besides, I tried another training configuration: I changed --load_size to 286 and --crop_size to 256, and added --netG resnet_6blocks: !python train.py --dataroot ./datasets/yicyclepix_0322 --name yivlp2rgbhuman --model cycle_gan --n_epochs 100 --n_epochs_decay 100 --epoch_count 89 --continue_train --lambda_A 25 --lambda_B 25 --batch_size 3 --netG resnet_6blocks --preprocess crop --load_size 286 --crop_size 256 --display_id -1
Training options:
----------------- Options ---------------
batch_size: 3 [default: 1]
beta1: 0.5
checkpoints_dir: ./checkpoints
continue_train: True [default: False]
crop_size: 256
dataroot: ./datasets/yicyclepix_0322 [default: None]
dataset_mode: unaligned
direction: AtoB
display_env: main
display_freq: 400
display_id: -1 [default: 1]
display_ncols: 4
display_port: 8097
display_server: http://localhost
display_winsize: 256
epoch: latest
epoch_count: 89 [default: 1]
gan_mode: lsgan
gpu_ids: 0
init_gain: 0.02
init_type: normal
input_nc: 3
isTrain: True [default: None]
lambda_A: 25.0 [default: 10.0]
lambda_B: 25.0 [default: 10.0]
lambda_identity: 0.5
load_iter: 0 [default: 0]
load_size: 286
lr: 0.0002
lr_decay_iters: 50
lr_policy: linear
max_dataset_size: inf
model: cycle_gan
n_epochs: 100
n_epochs_decay: 100
n_layers_D: 3
name: yivlp2rgbhuman [default: experiment_name]
ndf: 64
netD: basic
netG: resnet_6blocks [default: resnet_9blocks]
ngf: 64
no_dropout: True
no_flip: False
no_html: False
norm: instance
num_threads: 4
output_nc: 3
phase: train
pool_size: 50
preprocess: crop [default: resize_and_crop]
print_freq: 100
save_by_iter: False
save_epoch_freq: 5
save_latest_freq: 5000
serial_batches: False
suffix:
update_html_freq: 1000
use_wandb: False
verbose: False
----------------- End -------------------
Although I have only trained this CycleGAN to 133 epochs so far, the images generated during training suggest that the results may still be bad:
May I have your suggestions? Any help is much appreciated :)
Hi, Thank you for the awesome work.
I have a VLP-16 lidar and I want to use it for activity recognition. I want to treat point clouds as RGB images so that I can use OpenPose or other pose estimation networks. First, I convert my point clouds into depth images; the following are some examples:
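The point-cloud-to-depth-image step described above is commonly done as a spherical range projection. A self-contained sketch with assumed sensor parameters (16 rings, a ±15° vertical FOV as on a VLP-16, 1° horizontal bins, 30 m max range; the function name and all constants are illustrative, not from the original post):

```python
import numpy as np

def depth_image(points, h=16, w=360, max_range=30.0):
    """Project lidar points (N, 3) into an (h, w) spherical range image.
    Rows index elevation (ring), columns index azimuth; pixel intensity
    encodes normalized range (nearer points are darker)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x**2 + y**2 + z**2)
    azimuth = np.arctan2(y, x)                                   # [-pi, pi]
    elevation = np.arcsin(np.clip(z / np.maximum(r, 1e-6), -1.0, 1.0))
    col = ((azimuth + np.pi) / (2 * np.pi) * (w - 1)).astype(int)
    fov_up, fov_down = np.radians(15.0), np.radians(-15.0)       # VLP-16 FOV
    row = ((fov_up - elevation) / (fov_up - fov_down) * (h - 1)).astype(int)
    row = np.clip(row, 0, h - 1)
    img = np.zeros((h, w), dtype=np.float32)
    img[row, col] = np.clip(r / max_range, 0.0, 1.0) * 255.0
    return img
```

The 16-row image can then be resized (e.g. to 256x256) before being fed to CycleGAN, as with the examples above.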
I hope CycleGAN can translate these depth images (with a specific action) into corresponding RGB images; namely, if my depth images consist of walking, kicking, writing, climbing, etc., I hope CycleGAN can not only perform image-to-image translation but also preserve the correct action. Is this possible?
The link below shows my current results, and the following video is the input:
https://user-images.githubusercontent.com/49118957/156760287-cd8af0cf-06e7-41e7-9a13-477c365d3c09.mp4
Now my depth images only contain the front view of walking, and there is only one person in the scene. After 30 epochs of training (I only have Colab for training), CycleGAN outputs the above video. I use this command for training: python train.py --dataroot ./datasets/yivlp2yihuman --name yivlp2rgbhuman --model cycle_gan --n_epochs 15 --n_epochs_decay 15
I have some questions to ask: should I add
--preprocess none
on the command line? May I have your suggestions? Any help is much appreciated :)