Would you provide more detail about the experiment? I'm more than willing to help you with that. The error seems to be that target_num != params_num; the relevant code is in the test_g_model function of core/tasks/classification.py. params_num is the number of generated parameters, and target_num is the number of parameters being replaced during validation.
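For reference, here is a minimal sketch of the kind of check that fires there (only params_num, target_num, and test_g_model come from the thread; the function signature and everything else below are hypothetical, and the repo's actual code may differ):

```python
import torch

# Hypothetical sketch of the assertion in test_g_model
# (core/tasks/classification.py): the generated flat parameter vector
# must contain exactly as many values as the parameters it replaces.
def check_param_counts(generated_params: torch.Tensor, target_model: torch.nn.Module):
    params_num = generated_params.numel()  # number of generated parameters
    target_num = sum(p.numel() for p in target_model.parameters())  # parameters replaced in validation
    assert params_num == target_num, f"target_num {target_num} != params_num {params_num}"
```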
Sure. The dataset images are 28×28 with 3 channels; the ConvNet-3 I use is below:
```python
import torch.nn as nn

class ConvNet(nn.Module):
    def __init__(self, channel, num_classes, net_width, net_depth, net_act, net_norm, net_pooling, im_size=(28, 28)):
        super(ConvNet, self).__init__()
        self.features, shape_feat = self._make_layers(channel, net_width, net_depth, net_norm, net_act, net_pooling, im_size)
        num_feat = shape_feat[0] * shape_feat[1] * shape_feat[2]
        self.classifier = nn.Linear(num_feat, num_classes)

    def forward(self, x):
        out = self.features(x)
        out = out.view(out.size(0), -1)
        out = self.classifier(out)
        return out

    def embed(self, x):
        out = self.features(x)
        out = out.view(out.size(0), -1)
        return out

    def _get_activation(self, net_act):
        if net_act == 'sigmoid':
            return nn.Sigmoid()
        elif net_act == 'relu':
            return nn.ReLU(inplace=True)
        elif net_act == 'leakyrelu':
            return nn.LeakyReLU(negative_slope=0.01)
        elif net_act == 'swish':
            return Swish()  # Swish is defined elsewhere in the codebase
        else:
            exit('unknown activation function: %s' % net_act)

    def _get_pooling(self, net_pooling):
        if net_pooling == 'maxpooling':
            return nn.MaxPool2d(kernel_size=2, stride=2)
        elif net_pooling == 'avgpooling':
            return nn.AvgPool2d(kernel_size=2, stride=2)
        elif net_pooling == 'none':
            return None
        else:
            exit('unknown net_pooling: %s' % net_pooling)

    def _get_normlayer(self, net_norm, shape_feat):
        # shape_feat = (c, h, w)
        if net_norm == 'batchnorm':
            return nn.BatchNorm2d(shape_feat[0], affine=True)
        elif net_norm == 'layernorm':
            return nn.LayerNorm(shape_feat, elementwise_affine=True)
        elif net_norm == 'instancenorm':
            return nn.GroupNorm(shape_feat[0], shape_feat[0], affine=True)
        elif net_norm == 'groupnorm':
            return nn.GroupNorm(4, shape_feat[0], affine=True)
        elif net_norm == 'none':
            return None
        else:
            exit('unknown net_norm: %s' % net_norm)

    def _make_layers(self, channel, net_width, net_depth, net_norm, net_act, net_pooling, im_size):
        layers = []
        in_channels = channel
        # if im_size[0] == 28:
        #     im_size = (32, 32)
        shape_feat = [in_channels, im_size[0], im_size[1]]
        for d in range(net_depth):
            # layers += [nn.Conv2d(in_channels, net_width, kernel_size=3, padding=3 if channel == 1 and d == 0 else 1)]
            layers += [nn.Conv2d(in_channels, net_width, kernel_size=3, padding=1)]
            shape_feat[0] = net_width
            if net_norm != 'none':
                layers += [self._get_normlayer(net_norm, shape_feat)]
            layers += [self._get_activation(net_act)]
            in_channels = net_width
            if net_pooling != 'none':
                layers += [self._get_pooling(net_pooling)]
                shape_feat[1] //= 2
                shape_feat[2] //= 2
        return nn.Sequential(*layers), shape_feat
```
I set train_layer = 'all' since I am not sure which layers need to be fine-tuned. Now target_num is 307591, but params_num is still 2048. In fact, when I change ae_model.in_dim to 307591 in the config file, the program runs fine! Why? Can you help me?
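For what it's worth, target_num can be recomputed directly from the model. A sketch, assuming net_width=128, net_depth=3, instancenorm, avgpooling, and 7 classes (one combination that reproduces 307591 for a 3-channel 28×28 input; substitute whatever your experiment actually uses):

```python
import torch

# With train_layer='all', target_num is simply the count of every
# trainable parameter in the ConvNet. The hyperparameters below are an
# assumption; they happen to yield 307591 here, but adjust to your config.
net = ConvNet(channel=3, num_classes=7, net_width=128, net_depth=3,
              net_act='relu', net_norm='instancenorm',
              net_pooling='avgpooling', im_size=(28, 28))
target_num = sum(p.numel() for p in net.parameters())
print(target_num)  # 307591 under the assumptions above
```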
The bug occurs because the autoencoder model and the DDPM model are not adjusted to the number of parameters.
In configs/system/ae_ddpm, the backbone is configured as:
```yaml
ae_model:
  _target_: core.module.modules.encoder.medium
  in_dim: 2048
  input_noise_factor: 0.001
  latent_noise_factor: 0.5
model:
  arch:
    _target_: core.module.wrapper.ema.EMA
    model:
      _target_: core.module.modules.unet.AE_CNN_bottleneck
      in_channel: 1
      in_dim: 12
```
ae_model.in_dim must equal target_num, and (model.in_channel, model.in_dim) must match the latent shape.
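So for this experiment the fix would be along these lines (a sketch only: 307591 is the target_num from this issue, and the latent (in_channel, in_dim) must be read off from whatever latent shape the encoder actually produces for that in_dim):

```yaml
ae_model:
  _target_: core.module.modules.encoder.medium
  in_dim: 307591          # must equal target_num for train_layer='all'
  input_noise_factor: 0.001
  latent_noise_factor: 0.5
model:
  arch:
    _target_: core.module.wrapper.ema.EMA
    model:
      _target_: core.module.modules.unet.AE_CNN_bottleneck
      in_channel: 1       # (in_channel, in_dim) must match the AE's latent shape
      in_dim: 12          # verify against the encoder's latent output and adjust
```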
Glad to be able to help you. In fact, in_dim is a hyperparameter of the autoencoder; the detailed code is in core/module/modules/encoder.py. As you know, we use a 1D convolutional layer to extract features from the parameters, and in_dim is used to build the model, so if it is not set correctly it does not fit the parameter count. What's more, you must set the correct (in_channel, in_dim) in the UNet model. If not, you will encounter a bug when training the UNet for diffusion.
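To illustrate the point (a toy example, not the repo's actual encoder in core/module/modules/encoder.py): the autoencoder ingests the flattened parameter vector, so in_dim is baked into its layer shapes and must match the parameter count exactly.

```python
import torch
import torch.nn as nn

# Toy parameter autoencoder, illustrative only: a 1D conv runs over the
# flattened parameter vector, so a wrong in_dim produces shape mismatches.
class TinyParamAE(nn.Module):
    def __init__(self, in_dim, latent_len=12):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 8, kernel_size=3, padding=1),   # features along the parameter axis
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(latent_len),             # latent shape: (8, latent_len)
        )
        self.decoder = nn.Linear(8 * latent_len, in_dim)  # reconstruct all in_dim parameters

    def forward(self, x):                  # x: (batch, in_dim)
        z = self.encoder(x.unsqueeze(1))   # (batch, 8, latent_len)
        return self.decoder(z.flatten(1))  # (batch, in_dim)

ae = TinyParamAE(in_dim=307591)
recon = ae(torch.randn(2, 307591))
assert recon.shape == (2, 307591)  # works only because in_dim matches the input size
```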
I ran train_p_diff.py with the model ConvNet-3 and the system 'ae_ddpm'; train_layer is 'all'. However, it reports an AssertionError when running test_g_model: it seems that target_num is 307591 but params_num is 2048. What is happening? How can I fix the bug?