hila-chefer / TargetCLIP

[ECCV 2022] Official PyTorch implementation of the paper Image-Based CLIP-Guided Essence Transfer.

Role of coefficients #6

Closed NotNANtoN closed 2 years ago

NotNANtoN commented 2 years ago

Hello again,

In the lines linked below, you initialize a set of coefficients to optimize over. As far as I can see, these are not mentioned in the paper. The coefficients are multiplied by the direction per source image, so I gather that you want to optimize a different scale of the direction vector for each source image. I have some questions on this:

  1. Did you try it without these coefficients?
  2. To what values do the coefficients converge? Do they stay close to 1?
  3. You re-initialize the Adam optimizer for the coefficients for every step within the optimization, hence drastically changing the behavior of the optimizer. Is this intended or a misplacement? If it is intended, what is it used for?

Thanks again for your work! I hope I am not too picky on this - I'm just curious about the topic of semantics in these latent spaces :-)

https://github.com/hila-chefer/TargetCLIP/blob/b5dd2a492bf436fa26cfa4c02021a957b6a2a5ec/optimization/find_dirs.py#L123-L140

hila-chefer commented 2 years ago

Hi @NotNANtoN :) First, please feel free to ask anything, I'm happy to answer :) Indeed, we do not mention the optimization of coefficients in our paper, since it's a short 4-page paper (+ no supplementary). In addition, as you observed, the coefficients are used to allow for finer manipulation of each source. Intuitively, if the source and target are semantically close (say both have a beard), we would want to apply a smaller change to the source to resemble the target.
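As a rough illustration of the per-source coefficients described above, the sketch below assumes the edit is additive: each source latent is shifted along one shared direction, scaled by that source's own coefficient. The name `apply_direction`, the toy 2-D latents, and the coefficient values are hypothetical and chosen only for illustration; the repository's `find_dirs.py` operates on generator latents as PyTorch tensors.

```python
def apply_direction(sources, direction, coefficients):
    """Shift each source latent along a shared direction, scaled per source.

    Assumption (not taken from the repository): the edit has the additive
    form  edited_i = source_i + coefficient_i * direction.
    """
    if len(sources) != len(coefficients):
        raise ValueError("need exactly one coefficient per source")
    return [
        [s + coeff * d for s, d in zip(source, direction)]
        for source, coeff in zip(sources, coefficients)
    ]


# Toy values: two 2-D "latents" and one shared direction.
sources = [[0.0, 1.0], [2.0, -1.0]]
direction = [1.0, 0.5]

# A smaller coefficient applies a weaker change (e.g. a source that is
# already semantically close to the target); a larger one applies a
# stronger change. Both values lie in the 0.5-1.8 range mentioned below.
coefficients = [0.5, 1.5]

edited = apply_direction(sources, direction, coefficients)
print(edited)
```

The direction is shared by all sources, while each coefficient is a separate scalar, which is what allows a finer, per-source strength of the same edit.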

  1. yes, a few times, overall the results were very similar.
  2. from what I observed, usually in the range: 0.5-1.8, this is also around the same range we provide in our notebooks :)
  3. you are absolutely right, this is a misplacement. The optimizer should be initialized for each direction, but not for each step. I'll fix this in my next code update (soon), thanks for the catch!

In the meantime, to address 3, I tried our joker training with the fix, and as you can see (results on the training set with the optimized coefficients) the difference in results isn't major. I hope I was able to answer all your questions :)

[image]
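To make the point about optimizer placement concrete, here is a minimal sketch of what re-creating Adam at every step does. It is not the repository's code: `ScalarAdam` is a hand-written, plain-Python stand-in for the standard Adam update (no weight decay, no AMSGrad), and the quadratic loss, the target value 1.25, and the learning rate are hypothetical. A freshly created Adam has zeroed moment estimates, so its first update is roughly `lr * sign(grad)`; re-creating it each step therefore turns every update into a fixed-size signed step.

```python
import math


class ScalarAdam:
    """Plain-Python Adam update for a single scalar parameter."""

    def __init__(self, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = 0.0  # first-moment (mean) estimate
        self.v = 0.0  # second-moment (uncentered variance) estimate
        self.t = 0    # number of updates taken so far

    def step(self, param, grad):
        self.t += 1
        self.m = self.beta1 * self.m + (1 - self.beta1) * grad
        self.v = self.beta2 * self.v + (1 - self.beta2) * grad * grad
        m_hat = self.m / (1 - self.beta1 ** self.t)  # bias correction
        v_hat = self.v / (1 - self.beta2 ** self.t)
        return param - self.lr * m_hat / (math.sqrt(v_hat) + self.eps)


def grad(coeff, target=1.25):
    """Gradient of the toy loss (coeff - target) ** 2."""
    return 2.0 * (coeff - target)


def optimize(steps, reinit_every_step):
    coeff = 1.0  # coefficient starts at 1, as a neutral scale
    opt = ScalarAdam()
    trajectory = []
    for _ in range(steps):
        if reinit_every_step:
            # Mirrors the misplacement: all optimizer state is discarded.
            opt = ScalarAdam()
        coeff = opt.step(coeff, grad(coeff))
        trajectory.append(coeff)
    return trajectory


persistent = optimize(300, reinit_every_step=False)
reset = optimize(300, reinit_every_step=True)
print("optimizer created once:   ", persistent[-1])
print("optimizer re-created/step:", reset[-1])
```

In this toy setting the re-created optimizer keeps moving by about `lr` per step and ends up hopping around the target, while the persistent one settles close to it. How much this matters in practice depends on the learning rate and the loss, which fits the observation above that the fix did not change the results much.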

hila-chefer commented 2 years ago

Hi @NotNANtoN, I’m closing this issue due to inactivity, but feel free to reopen if necessary.

NotNANtoN commented 2 years ago

Thanks a lot, your answers were very insightful :)