taki0112 / MUNIT-Tensorflow

Simple Tensorflow implementation of "Multimodal Unsupervised Image-to-Image Translation" (ECCV 2018)
MIT License

Perceptual Loss missing #9

Open Cuky88 opened 6 years ago

Cuky88 commented 6 years ago

Thanks for your great work @taki0112

I'm curious why you didn't implement the perceptual loss, is there a special reason?

Cheers.

taki0112 commented 6 years ago

To keep the code simpler. The original MUNIT uses the pretrained_vgg16_lua_version.

So to load this model, we need the load_lua function in PyTorch. I wanted to write the code using only TensorFlow, so I didn't implement the perceptual loss.

However, I will also make it possible with TensorFlow.

Thanks

Cuky88 commented 6 years ago

Hi, thanks for the fast reply.

I saw that and I'm already working on a solution in tf only.

taki0112 commented 6 years ago

Would it be possible to open a PR?

Cuky88 commented 6 years ago

Yes, I'll send you a PR when I'm finished

taki0112 commented 6 years ago

Thanks a lot.

Cuky88 commented 6 years ago

@taki0112 Hi, I think I managed to get perceptual loss to work, but I'm not 100% sure.

Unfortunately I cannot create a PR, since I changed a couple of other things before. Please look at this commit. It contains everything for the perceptual loss. You can find the download links for the vgg16 weight files in the comment section at the top of vgg16.py. You can also look in this config file to see which values the arguments for the vgg part have. I only tested everything with the .h5 weight files.

Another issue would be to implement the LPIPS distance in TF as well. Do you plan to do that?

It would be very good if you could look over the code, I'm not an expert in TF like you :)

EDIT: this commit is also needed.

MartinMeliss commented 6 years ago

@Cuky88 Hi. I think you have a bug in the "vgg_preprocess" method in "ops.py". You subtract the means from the image, but the image is in the range -1 to 1, while your mean values are for the byte representation (0 to 255). And, as far as I understand, VGG-16 requires unnormalized input. Line #263 of ops.py: channels[i] -= means[i] should look like: channels[i] = (channels[i] + 1.0) * 127.5 - means[i]

Correct me if I'm wrong.
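
To make the suggested fix concrete, here is a minimal sketch of the rescaling in NumPy rather than TensorFlow. It is not the code from ops.py: the function body, the array layout (channels last, RGB order) and the mean values (the commonly used ImageNet channel means) are assumptions for illustration. Whether the network expects RGB or BGR depends on the weight file used.

```python
import numpy as np

# Assumption: the commonly used ImageNet per-channel means, RGB order,
# expressed on the 0..255 byte scale. The repo's values may differ.
VGG_MEANS_RGB = np.array([123.68, 116.779, 103.939], dtype=np.float32)


def vgg_preprocess(images):
    """Rescale images from [-1, 1] to [0, 255], then subtract channel means.

    images: float array of shape (..., 3), channels last, values in [-1, 1].
    """
    images = np.asarray(images, dtype=np.float32)
    rescaled = (images + 1.0) * 127.5   # [-1, 1] -> [0, 255]
    return rescaled - VGG_MEANS_RGB     # broadcasts over the last axis


if __name__ == "__main__":
    # A mid-gray batch (all zeros in [-1, 1]) maps to 127.5 before mean subtraction.
    batch = np.zeros((1, 2, 2, 3), dtype=np.float32)
    print(vgg_preprocess(batch)[0, 0, 0])
```

Without the rescaling step, the means (around 100 to 125) would be subtracted from values no larger than 1, so every input the VGG network sees would be nearly constant.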

Cuky88 commented 6 years ago

@MartinMeliss You are right, the image floats are scaled between -1 and 1, so the VGG preprocessing is screwing up everything. Thanks for pointing that out, will fix soon.