awei669 / VQ-Font

[ICCV 2023] Few shot font generation via transferring similarity guided global and quantization local styles
https://arxiv.org/abs/2309.00827

About the training of the VQ-VAE. #7

Open LsFlyt opened 9 months ago

LsFlyt commented 9 months ago

In VQ-Font/model/VQ-VAE.ipynb:

```python
for i in xrange(num_training_updates):
    data = next(iter(train_loader))
    train_data_variance = torch.var(data)
    print(train_data_variance)
    # show(make_grid(data.cpu().data))
    # break
    data = data - 0.5  # normalize to [-0.5, 0.5]
    data = data.to(device)
    optimizer.zero_grad()
```

The code normalizes the data to [-0.5, 0.5]. However, the last layer of the VQ-VAE decoder is a sigmoid, whose output lies in (0, 1). Is this a mistake?
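To illustrate the concern: a sigmoid output head can never reach the negative half of a [-0.5, 0.5] target range, so the reconstruction error for dark pixels has a hard floor. A minimal sketch (the logits here are arbitrary stand-ins, not values from the actual model):

```python
import torch

# Targets normalized to [-0.5, 0.5], as in the notebook snippet above.
targets = torch.tensor([-0.5, -0.25, 0.0, 0.25, 0.5])

# A sigmoid head maps any logit into (0, 1), so the negative
# targets are unreachable no matter how the decoder is trained.
logits = torch.tensor([-20.0, -20.0, -20.0, 0.0, 20.0])  # arbitrary
outputs = torch.sigmoid(logits)

# Even the smallest possible sigmoid output is still > 0,
# leaving an error of at least 0.5 against the target -0.5.
min_error = (outputs[0] - targets[0]).abs()
```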

LsFlyt commented 9 months ago

And another question: in the VQ-VAE, the data are normalized to [-0.5, 0.5], but in training phase 2, the content image (which is fed to the content encoder) is normalized to [-1, 1].
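For concreteness, the two conventions differ by a factor of two: starting from pixels in [0, 1], subtracting 0.5 gives [-0.5, 0.5], while the usual `x * 2 - 1` mapping gives [-1, 1]. A minimal sketch of the mismatch (the tensor here is a random stand-in, not data from the repo):

```python
import torch

img = torch.rand(1, 1, 4, 4)  # stand-in image tensor in [0, 1]

vqvae_input = img - 0.5          # phase-1 convention: [-0.5, 0.5]
content_input = img * 2.0 - 1.0  # phase-2 convention: [-1, 1]

# The phase-2 input is exactly twice the phase-1 input,
# so the two encoders see inputs at different scales.
scale_gap = content_input / vqvae_input.clamp(min=1e-6)
```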

awei669 commented 9 months ago

Sorry for the late reply. I can't remember why I used `data = data - 0.5  # normalize to [-0.5, 0.5]` because it was a long time ago. It may or may not be a mistake.