irasin / Pytorch_AdaIN

Pytorch implementation from scratch of [Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization [Huang+, ICCV2017]]
GNU General Public License v3.0

Model isn't learning #7

Closed sonnguyen129 closed 2 years ago

sonnguyen129 commented 3 years ago

I trained the model for 20 epochs, but it doesn't seem to learn anything. Can you help me fix it? (Attached image: 1_epoch_2000_iteration)

irasin commented 3 years ago

Hi @sonnguyentruong129, I am not sure whether this code still works well on the newest PyTorch. That said, neural style transfer (NST) training is quite sensitive to the learning rate, the fusion rate, the initial weight values, and other factors, including the PyTorch code itself.

My suggestion is to set the AdaIN fusion rate between the content features and the style features to 0 and train the model again. In that case the model is simply learning an autoencoder, and it should produce good results in fewer than 20 epochs.

If the autoencoder trains properly, you can then use the autoencoder's weights as the AdaIN model, which means setting the fusion rate back to the normal value you want.
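
A minimal sketch of this autoencoder check, under several assumptions: `encoder` and `decoder` are tiny hypothetical stand-ins for the repo's VGG encoder and decoder, the pixel-level MSE loss is chosen only for illustration (the repo's actual training loss is not shown in this thread), and the `adain` helper follows the formula from the AdaIN paper rather than the repo's code. With `alpha=0.0` the style branch drops out of the blend, so the decoder only has to reconstruct the content image.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the repo's frozen VGG encoder and trainable decoder.
encoder = nn.Conv2d(3, 8, kernel_size=3, padding=1).requires_grad_(False)
decoder = nn.Conv2d(8, 3, kernel_size=3, padding=1)
optimizer = torch.optim.Adam(decoder.parameters(), lr=1e-2)


def adain(content_features, style_features, eps=1e-5):
    # Paper formula (assumed): match per-channel mean/std of content to style.
    c_mean = content_features.mean(dim=(2, 3), keepdim=True)
    c_std = content_features.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style_features.mean(dim=(2, 3), keepdim=True)
    s_std = style_features.std(dim=(2, 3), keepdim=True) + eps
    return s_std * (content_features - c_mean) / c_std + s_mean


def forward(content_images, style_images, alpha):
    content_features = encoder(content_images)
    style_features = encoder(style_images)
    t = adain(content_features, style_features)
    # Same blend as in model.py; alpha=0.0 leaves only the content features.
    t = alpha * t + (1 - alpha) * content_features
    return decoder(t)


# Random tensors stand in for a fixed batch of content and style images.
content = torch.rand(4, 3, 16, 16)
style = torch.rand(4, 3, 16, 16)

losses = []
for _ in range(50):
    out = forward(content, style, alpha=0.0)
    # Illustrative reconstruction loss (an assumption, not the repo's loss).
    loss = nn.functional.mse_loss(out, content)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    losses.append(loss.item())
```

If the reconstruction loss does not go down even in this setting, the problem is more likely in the decoder, optimizer, or data pipeline than in the AdaIN step itself.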

Please give it a try. Thanks.

sonnguyen129 commented 3 years ago

Do you mean the AdaIN fusion rate is the alpha variable or the lam variable?

irasin commented 3 years ago

The fusion rate is alpha; see the line t = alpha * t + (1 - alpha) * content_features in model.py:


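    def generate(self, content_images, style_images, alpha=1.0):
        # Encode both images with the VGG encoder, keeping only the last feature map.
        content_features = self.vgg_encoder(content_images, output_last_feature=True)
        style_features = self.vgg_encoder(style_images, output_last_feature=True)
        # Apply AdaIN to the content features using the style features.
        t = adain(content_features, style_features)
        # Fusion rate: alpha=1.0 is full stylization, alpha=0.0 keeps the plain content features.
        t = alpha * t + (1 - alpha) * content_features
        # Decode the blended features back into an image.
        out = self.decoder(t)
        return out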
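
To make the role of alpha concrete, here is a small standalone sketch. The `adain` function below is an assumption based on the formula in the AdaIN paper (per-channel mean and standard deviation matching); the repo's own `adain` is not shown in this thread and may differ in details such as the epsilon value. `blend` is a hypothetical helper that only wraps the two lines from `generate`.

```python
import torch


def adain(content_features, style_features, eps=1e-5):
    # Statistics are computed per sample and per channel over the spatial dims.
    c_mean = content_features.mean(dim=(2, 3), keepdim=True)
    c_std = content_features.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style_features.mean(dim=(2, 3), keepdim=True)
    s_std = style_features.std(dim=(2, 3), keepdim=True) + eps
    # Normalize the content features, then rescale with the style statistics.
    return s_std * (content_features - c_mean) / c_std + s_mean


def blend(content_features, style_features, alpha=1.0):
    # Hypothetical helper: the AdaIN step plus the alpha interpolation.
    t = adain(content_features, style_features)
    return alpha * t + (1 - alpha) * content_features


torch.manual_seed(0)
# Random tensors stand in for encoder outputs of shape (N, C, H, W).
content_features = torch.randn(2, 8, 16, 16)
style_features = 3.0 * torch.randn(2, 8, 16, 16) + 1.0

stylized = blend(content_features, style_features, alpha=1.0)
identity = blend(content_features, style_features, alpha=0.0)
```

With `alpha=1.0` the per-channel means of `stylized` match those of `style_features`, and with `alpha=0.0` the result is just `content_features`, which is why that setting reduces the model to an autoencoder.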

sonnguyen129 commented 2 years ago

Thank you for your reply. Can you explain why dataset.py uses the _resize function and then a random crop? Why not use torchvision's resize and random crop from the start?

irasin commented 2 years ago

Hi @sonnguyen129, using torchvision alone is perfectly fine. However, I didn't know how to use it when I wrote this repo, because I was not familiar with PyTorch 3 years ago. That is why I implemented the resize myself.