Open yliats opened 4 years ago
I am facing a similar issue too
Check out the changes here; I made them and it worked: https://github.com/NVIDIA/pix2pixHD/compare/master...WenliangDu:master
I had to make some minor adjustments after following your changes, but it worked! Thank you very much. I was expecting the computation time per epoch to decrease, since the architecture is thinner now (fewer channels in the generator output), so I was surprised to see that it increased a little.
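A rough back-of-the-envelope check of why a one-channel output is unlikely to speed up an epoch noticeably. This is only a sketch: it assumes the default global generator sizes shown in the printout further down this thread (a final `Conv2d(64, 3, kernel_size=7)` and 9 residual blocks at 1024 channels after 4 stride-2 downsamplings), and it counts multiply-accumulates only, ignoring memory traffic, the discriminators, and the losses. The helper name `conv_macs` is made up for this illustration.

```python
def conv_macs(in_ch, out_ch, kernel):
    """Multiply-accumulates per output pixel of a square-kernel convolution."""
    return in_ch * out_ch * kernel * kernel


# Final layer of the generator, applied at full resolution.
final_rgb = conv_macs(64, 3, 7)    # 3-channel output
final_gray = conv_macs(64, 1, 7)   # 1-channel output
saved = final_rgb - final_gray     # work saved per full-resolution pixel

# Residual trunk: 9 blocks x 2 convolutions of Conv2d(1024, 1024, kernel_size=3).
# After 4 stride-2 downsamplings each side is 1/16, so the trunk runs on
# 1/256 of the full-resolution pixels.
downsample_factor = (2 ** 4) ** 2
trunk = 9 * 2 * conv_macs(1024, 1024, 3) // downsample_factor

print("saved per pixel:", saved)
print("trunk per pixel:", trunk)
print("saving relative to trunk: {:.2%}".format(saved / trunk))
```

Under these assumptions the saving is a bit under 1% of the residual trunk alone, so it is easily hidden by run-to-run noise. A small increase in epoch time more likely comes from something else, such as data loading.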
Hello, I would like to use semantic segmentation images to improve my training. I have an RGB directory named train_A, another RGB directory named train_B, and the semantic segmentation images in a directory named train_inst. My semantic segmentation images have 2 channels, i.e. N=2.
My parameters are:
------------ Options -------------
batchSize: 1
beta1: 0.5
checkpoints_dir: ./checkpoints
continue_train: False
data_type: 32
dataroot: ./datasets/test/
debug: False
display_freq: 100
display_winsize: 512
feat_num: 2
fineSize: 512
fp16: False
gpu_ids: [0]
input_nc: 3
instance_feat: False
isTrain: True
label_feat: True
label_nc: 0
lambda_feat: 10.0
loadSize: 1024
load_features: False
load_pretrain:
local_rank: 0
lr: 0.0002
max_dataset_size: inf
model: pix2pixHD
nThreads: 2
n_blocks_global: 9
n_blocks_local: 3
n_clusters: 2
n_downsample_E: 4
n_downsample_global: 4
n_layers_D: 3
n_local_enhancers: 1
name: test
ndf: 64
nef: 16
netG: global
ngf: 64
niter: 100
niter_decay: 100
niter_fix_global: 0
no_flip: False
no_ganFeat_loss: False
no_html: False
no_instance: True
no_lsgan: False
no_vgg_loss: False
norm: instance
num_D: 2
output_nc: 3
phase: train
pool_size: 0
print_freq: 100
resize_or_crop: scale_width
save_epoch_freq: 10
save_latest_freq: 1000
serial_batches: False
tf_log: False
use_dropout: False
verbose: False
which_epoch: latest
-------------- End ----------------
CustomDatasetDataLoader
dataset [AlignedDataset] was created
#training images = 1
GlobalGenerator(
(model): Sequential(
(0): ReflectionPad2d((3, 3, 3, 3))
(1): Conv2d(5, 64, kernel_size=(7, 7), stride=(1, 1))
(2): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(5): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(6): ReLU(inplace=True)
(7): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(8): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(9): ReLU(inplace=True)
(10): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(11): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(12): ReLU(inplace=True)
(13): Conv2d(512, 1024, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(14): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(15): ReLU(inplace=True)
(16): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(17): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(18): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(19): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(20): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(21): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(22): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(23): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(24): ResnetBlock(
(conv_block): Sequential(
(0): ReflectionPad2d((1, 1, 1, 1))
(1): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(2): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): ReflectionPad2d((1, 1, 1, 1))
(5): Conv2d(1024, 1024, kernel_size=(3, 3), stride=(1, 1))
(6): InstanceNorm2d(1024, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
)
)
(25): ConvTranspose2d(1024, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(26): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(27): ReLU(inplace=True)
(28): ConvTranspose2d(512, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(29): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(30): ReLU(inplace=True)
(31): ConvTranspose2d(256, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(32): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(33): ReLU(inplace=True)
(34): ConvTranspose2d(128, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(35): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(36): ReLU(inplace=True)
(37): ReflectionPad2d((3, 3, 3, 3))
(38): Conv2d(64, 3, kernel_size=(7, 7), stride=(1, 1))
(39): Tanh()
)
)
MultiscaleDiscriminator(
(scale0_layer0): Sequential(
(0): Conv2d(6, 64, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): LeakyReLU(negative_slope=0.2, inplace=True)
)
(scale0_layer1): Sequential(
(0): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace=True)
)
(scale0_layer2): Sequential(
(0): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace=True)
)
(scale0_layer3): Sequential(
(0): Conv2d(256, 512, kernel_size=(4, 4), stride=(1, 1), padding=(2, 2))
(1): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace=True)
)
(scale0_layer4): Sequential(
(0): Conv2d(512, 1, kernel_size=(4, 4), stride=(1, 1), padding=(2, 2))
)
(scale1_layer0): Sequential(
(0): Conv2d(6, 64, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): LeakyReLU(negative_slope=0.2, inplace=True)
)
(scale1_layer1): Sequential(
(0): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace=True)
)
(scale1_layer2): Sequential(
(0): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2), padding=(2, 2))
(1): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace=True)
)
(scale1_layer3): Sequential(
(0): Conv2d(256, 512, kernel_size=(4, 4), stride=(1, 1), padding=(2, 2))
(1): InstanceNorm2d(512, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(2): LeakyReLU(negative_slope=0.2, inplace=True)
)
(scale1_layer4): Sequential(
(0): Conv2d(512, 1, kernel_size=(4, 4), stride=(1, 1), padding=(2, 2))
)
(downsample): AvgPool2d(kernel_size=3, stride=2, padding=[1, 1])
)
Encoder(
(model): Sequential(
(0): ReflectionPad2d((3, 3, 3, 3))
(1): Conv2d(3, 16, kernel_size=(7, 7), stride=(1, 1))
(2): InstanceNorm2d(16, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(3): ReLU(inplace=True)
(4): Conv2d(16, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(5): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(6): ReLU(inplace=True)
(7): Conv2d(32, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(8): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(9): ReLU(inplace=True)
(10): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(11): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(12): ReLU(inplace=True)
(13): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
(14): InstanceNorm2d(256, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(15): ReLU(inplace=True)
(16): ConvTranspose2d(256, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(17): InstanceNorm2d(128, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(18): ReLU(inplace=True)
(19): ConvTranspose2d(128, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(20): InstanceNorm2d(64, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(21): ReLU(inplace=True)
(22): ConvTranspose2d(64, 32, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(23): InstanceNorm2d(32, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(24): ReLU(inplace=True)
(25): ConvTranspose2d(32, 16, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), output_padding=(1, 1))
(26): InstanceNorm2d(16, eps=1e-05, momentum=0.1, affine=False, track_running_stats=False)
(27): ReLU(inplace=True)
(28): ReflectionPad2d((3, 3, 3, 3))
(29): Conv2d(16, 2, kernel_size=(7, 7), stride=(1, 1))
(30): Tanh()
)
)
create web directory ./checkpoints\test\web...
C:/cb/pytorch_1000000000000/work/aten/src/ATen/native/cuda/IndexKernel.cu:84: block: [1,0,0], thread: [123,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
C:/cb/pytorch_1000000000000/work/aten/src/ATen/native/cuda/IndexKernel.cu:84: block: [1,0,0], thread: [124,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
C:/cb/pytorch_1000000000000/work/aten/src/ATen/native/cuda/IndexKernel.cu:84: block: [1,0,0], thread: [125,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
C:/cb/pytorch_1000000000000/work/aten/src/ATen/native/cuda/IndexKernel.cu:84: block: [1,0,0], thread: [126,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
C:/cb/pytorch_1000000000000/work/aten/src/ATen/native/cuda/IndexKernel.cu:84: block: [1,0,0], thread: [127,0,0] Assertion `index >= -sizes[i] && index < sizes[i] && "index out of bounds"` failed.
Traceback (most recent call last):
File "train_win10.py", line 148, in <module>
train()
File "train_win10.py", line 73, in train
Variable(data['image']), Variable(data['feat']), infer=save_fake)
File "C:\Users\*****\anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "C:\Users\*****\anaconda3\envs\pytorch\lib\site-packages\torch\nn\parallel\data_parallel.py", line 159, in forward
return self.module(*inputs[0], **kwargs[0])
File "C:\Users\*****\anaconda3\envs\pytorch\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "D:\Desktop\pix2pixHD-master\models\pix2pixHD_model.py", line 159, in forward
feat_map = self.netE.forward(real_image, inst_map)
File "D:\Desktop\pix2pixHD-master\models\networks.py", line 285, in forward
indices = (inst[b:b+1] == int(i)).nonzero() # n x 4
RuntimeError: CUDA error: device-side assert triggered
What is my problem? Is it caused by the semantic segmentation images, or do I need to modify the code? If so, how should I modify it?
Thanks for your time.
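One possible explanation, offered as a guess rather than a confirmed diagnosis: the lines after `networks.py` line 285 are not shown in the traceback, but if the loop there indexes the encoder output by channel as `indices[:,1] + j` for `j` in `range(feat_num)` (an assumption about the repository code), then an instance map with more than one channel pushes that index past the last feature channel. The sketch below only does this arithmetic; the function names are hypothetical and nothing here calls PyTorch.

```python
def max_channel_index(inst_channels, feat_num):
    """Largest value of indices[:, 1] + j.

    Assumes indices[:, 1] ranges over the instance-map channels
    (0 .. inst_channels - 1) and j over range(feat_num).
    """
    return (inst_channels - 1) + (feat_num - 1)


def is_in_bounds(inst_channels, feat_num):
    """True if every channel index stays inside an output with feat_num channels."""
    return max_channel_index(inst_channels, feat_num) < feat_num


feat_num = 2  # value from the options printed above
for inst_channels in (1, 2, 3):
    print(
        "inst channels:", inst_channels,
        "max index:", max_channel_index(inst_channels, feat_num),
        "in bounds:", is_in_bounds(inst_channels, feat_num),
    )
```

With `feat_num: 2`, only a single-channel instance map stays in bounds under this assumption, which would match the "index out of bounds" assertion for a 2-channel train_inst image. CUDA also reports device-side asserts asynchronously, so the Python line in the traceback is not necessarily the one that failed. Running once with the environment variable `CUDA_LAUNCH_BLOCKING=1`, or on the CPU, usually gives a more precise location.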
Could you please explain how you modified the code to go from a 3-channel input to a 1-channel output? As far as I can tell, master...WenliangDu:master only accepts a 1-channel input.
Hello. First of all I would like to say that your results are very impressive. My research involves image to image translation in high resolution while preserving fine details. I have a dataset of RGB images (1280x760) and corresponding grayscale images that I would like the network to learn to generate based on the RGB.
I have the RGB directory named as train_A, the grayscale images directory named as train_B, and the semantic segmentation images (that will be converted to edge_map?) named as train_inst.
some of the parameters that I've used:
Does the algorithm automatically read the images from train_B as grayscale (because output_nc = 1)? No matter what I try, I always get some kind of dimension mismatch.
What is my mistake?
Another problem that I had is that for some reason, when I debug, the CPU load is 100% even though there shouldn't be any calculations. This makes debugging very hard.
Thank you.
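On the grayscale question: as far as I recall, the dataset loader opens the train_B images with `.convert('RGB')` regardless of `output_nc`, so the target would still have 3 channels. Treat that as an assumption to verify in `data/aligned_dataset.py`. A minimal sketch of choosing the image mode and normalization statistics from `output_nc` is below. `pil_mode_for`, `norm_stats` and `load_target` are hypothetical helpers, not part of the repository, and Pillow is imported lazily so the first two can be tried without it.

```python
def pil_mode_for(output_nc):
    """Pillow mode whose channel count matches output_nc."""
    if output_nc == 1:
        return "L"      # 8-bit grayscale, one channel
    if output_nc == 3:
        return "RGB"    # three channels
    raise ValueError("unsupported output_nc: {}".format(output_nc))


def norm_stats(output_nc):
    """Per-channel (mean, std) tuples that map [0, 1] to [-1, 1]."""
    return (0.5,) * output_nc, (0.5,) * output_nc


def load_target(path, output_nc):
    """Open a target image with the channel count the generator produces."""
    from PIL import Image  # requires Pillow; imported only when called
    return Image.open(path).convert(pil_mode_for(output_nc))


print(pil_mode_for(1), norm_stats(1))
print(pil_mode_for(3), norm_stats(3))
```

If the target is loaded as one channel, anything downstream that expects 3 channels needs the same treatment. In particular, a 3-value normalization and the VGG feature loss both assume RGB, so the loss can be disabled with `--no_vgg_loss` or the gray channel repeated three times before it. Either of those could be the source of the dimension mismatch.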