taki0112 / UGATIT

Official Tensorflow implementation of U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation (ICLR 2020)
MIT License

CAM loss is degrading the results #108

Closed sudarshanregmi closed 3 years ago

sudarshanregmi commented 3 years ago

I'm using the light version of UGATIT for a project. The introduction of CAM loss seems to be degrading the quality of results.

I'm wondering whether the CAM loss only improves performance for the heavy (non-light) model. Is that the case? Any help is appreciated. Thank you.
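For reference, the CAM loss I mean is the auxiliary classifier term from the paper. Below is a minimal TF2-style sketch of how I understand it (this repo itself is TF1, and the function/argument names are mine, not the repo's):

```python
import tensorflow as tf

def generator_cam_loss(source_cam_logit, target_cam_logit):
    """Sketch of the generator-side CAM loss as I understand the paper.

    source_cam_logit: CAM logits for an image from the source domain
                      (should be classified as 1, i.e. "needs translation").
    target_cam_logit: CAM logits for an image already from the target
                      domain (should be classified as 0).
    Both are [B, 2] tensors (one logit each from the GAP and GMP branches).
    """
    bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
    return (bce(tf.ones_like(source_cam_logit), source_cam_logit)
            + bce(tf.zeros_like(target_cam_logit), target_cam_logit))
```

Lowering the weight of this term (cam_weight in main.py, if I'm reading the code right) might be a cheaper experiment than removing it entirely.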

bcccat commented 3 years ago

I have had the same experience. In my case, there are basically two reasons for this:

  1. The task of your project doesn't require much shape change, or is even hurt by it.
  2. The dataset you are using doesn't consist of RGB camera images, which might make the CAM part act unexpectedly.

As for your guess, I don't see a reason for the CAM loss to fail in light mode, since light mode only adds an extra pooling step. I also tested a task similar to the one in the original paper on different datasets, and it worked in light mode (though not as well as heavy mode, for sure).
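To make the "only an extra pooling" point concrete: as far as I can tell, the full and light models differ only in what the gamma/beta MLP for AdaLIN sees as input. A simplified sketch (my own naming, not the repo's exact code):

```python
import tensorflow as tf

def adalin_gamma_beta(feature_map, light, units=256):
    """Rough sketch of the MLP that predicts AdaLIN gamma/beta in U-GAT-IT.

    feature_map: [B, H, W, C] bottleneck features from the generator encoder.
    In light mode the features are global-average-pooled first, so the MLP
    sees a [B, C] vector instead of the flattened [B, H*W*C] of the full
    model; the CAM branch itself is the same in both modes.
    """
    if light:
        x = tf.reduce_mean(feature_map, axis=[1, 2])   # extra global average pooling
    else:
        x = tf.keras.layers.Flatten()(feature_map)     # much larger, memory-heavy input
    for _ in range(2):
        x = tf.keras.layers.Dense(units, activation=tf.nn.relu)(x)
    gamma = tf.keras.layers.Dense(units)(x)
    beta = tf.keras.layers.Dense(units)(x)
    return gamma, beta
```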

Hope my answer helps, cheers.

sudarshanregmi commented 3 years ago

Thank you for your response! I really appreciate it.

Yeah, your points seem valid in my case too. I also think the images I'm using don't require much shape change. And yes, I'm working with single-channel images, so CAM seems to be overkill.

Moreover, any thoughts on the effect of CAM when using a paired dataset instead of an unpaired one? Right now, I think CAM will be more effective with an unpaired dataset than with a paired one.

bcccat commented 3 years ago

> Moreover, any thoughts on the effect of CAM when using a paired dataset instead of an unpaired one? Right now, I think CAM will be more effective with an unpaired dataset than with a paired one.

Glad that my answer helps. CAM, in my opinion, will certainly help for paired datasets too. Recalling how CAM works makes the reason fairly obvious: CAM tells the network which activation maps to focus on. The catch is that you might need to make sure CAM highlights exactly the spatial positions you want it to.
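If it helps, this is roughly how I picture the CAM block, plus the heatmap you can dump to check where it attends. A simplified single-branch sketch (the actual generator uses both average- and max-pooling branches; the class and names below are mine):

```python
import tensorflow as tf

class CAMBlock(tf.keras.layers.Layer):
    """Single-branch sketch of the CAM attention in a U-GAT-IT-style generator."""

    def __init__(self, channels):
        super().__init__()
        self.fc = tf.keras.layers.Dense(1, use_bias=False)        # auxiliary classifier weights
        self.conv = tf.keras.layers.Conv2D(channels, 1, activation=tf.nn.relu)

    def call(self, feature_map):                                   # [B, H, W, C]
        gap = tf.reduce_mean(feature_map, axis=[1, 2])             # [B, C]
        cam_logit = self.fc(gap)                                   # [B, 1], fed to the CAM loss
        w = tf.reshape(self.fc.kernel, [1, 1, 1, -1])              # per-channel importance
        weighted = feature_map * w                                 # re-weight the feature map
        heatmap = tf.reduce_sum(weighted, axis=-1)                 # [B, H, W], where CAM attends
        out = self.conv(weighted)                                  # features passed to the decoder
        return out, cam_logit, heatmap
```

Saving that heatmap as an overlay on the input every few iterations is the quickest way I know to verify that CAM is attending to the regions you actually care about in a paired setup.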

gdwei commented 3 years ago

> 2. The dataset you are using doesn't consist of RGB camera images, which might make the CAM part act unexpectedly.

Hi, why would CAM only work with RGB-based images? Do you have any insights?

bcccat commented 3 years ago

> Hi, why would CAM only work with RGB-based images? Do you have any insights?

Simply declaring that CAM only works with RGB would be wrong, or at least not accurate. By "CAM might act unexpectedly", I merely meant that CAM embedded in a CycleGAN-style network may not behave the way you hope, since the CAM part is updated with guidance from the Discriminator in an unsupervised manner.
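To make the "guidance from the Discriminator" part concrete: the discriminator also produces CAM logits, and they are only pushed around by the real-vs-fake signal; nothing in the objective says where the attention should land. An LSGAN-style sketch (LSGAN is the default gan_type here, if I remember correctly; names are mine):

```python
import tensorflow as tf

def cam_adversarial_losses(real_cam_logit, fake_cam_logit):
    """LSGAN-style sketch of the adversarial terms on the discriminator's CAM logits.

    There is no ground-truth attention map anywhere in this objective, which is
    why the learned attention can end up somewhere you did not intend.
    """
    d_cam_loss = (tf.reduce_mean(tf.square(real_cam_logit - 1.0))
                  + tf.reduce_mean(tf.square(fake_cam_logit)))
    g_cam_loss = tf.reduce_mean(tf.square(fake_cam_logit - 1.0))
    return d_cam_loss, g_cam_loss
```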
BTW, I believe that this issue can now be closed, right? @sudarshanregmi

sudarshanregmi commented 3 years ago

Oh, yes @bcccat. Thanks for your help! 👍