Closed NotNANtoN closed 2 years ago
Hi @NotNANtoN :) First, please feel free to ask anything, I'm happy to answer :) Indeed, we do not mention the optimization of the coefficients in our paper, since it's a short 4-page paper with no supplementary material. In addition, as you observed, the coefficients allow for finer manipulation of each source. Intuitively, if the source and target are already semantically close (say both have a beard), we would want to apply a smaller change to the source to make it resemble the target.
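To make the intuition concrete, here is a minimal toy sketch, not the code from `find_dirs.py`. It assumes a 2-D latent space, a squared-distance loss standing in for the CLIP-based loss, and a direction that is held fixed so the effect of the coefficients is easy to see. The function name `fit_coefficients`, the initial value of 1.0, and all numbers are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def fit_coefficients(sources, target, direction, lr=0.1, steps=200):
    """Fit one scalar coefficient per source by gradient descent.

    Toy loss per source: || w_i + c_i * d - t ||^2
    (an assumed stand-in for a CLIP similarity loss).
    """
    coeffs = [1.0 for _ in sources]  # hypothetical initialisation
    for _ in range(steps):
        for i, w in enumerate(sources):
            # Residual between the manipulated source and the target.
            residual = [
                wk + coeffs[i] * dk - tk
                for wk, dk, tk in zip(w, direction, target)
            ]
            # d/dc ||residual||^2 = 2 * <residual, direction>
            grad = 2.0 * dot(residual, direction)
            coeffs[i] -= lr * grad
    return coeffs


target = [1.0, 0.0]
direction = [1.0, 0.0]          # shared direction, fixed in this toy
sources = [
    [0.8, 0.1],                 # already close to the target along the direction
    [0.0, -0.2],                # far from the target along the direction
]
coeffs = fit_coefficients(sources, target, direction)
print(coeffs)
```

In this toy setup the first source, which already sits near the target along the direction, ends up with a smaller coefficient than the second one, which mirrors the "smaller change for semantically close sources" intuition.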
Hi @NotNANtoN, I’m closing this issue due to inactivity, but feel free to reopen if necessary.
Thanks a lot, your answers were very insightful :)
Hello again,
in these marked lines you initialize a set of coefficients to optimize over. As far as I can see, these are not mentioned in the paper. The coefficients are multiplied by the direction per source image, so I understand that you want to optimize a separate scale of the direction vector for each source image. I have some questions on this:
Thanks again for your work! I hope I am not being too picky about this - I'm just curious about the topic of semantics in these latent spaces :-)
https://github.com/hila-chefer/TargetCLIP/blob/b5dd2a492bf436fa26cfa4c02021a957b6a2a5ec/optimization/find_dirs.py#L123-L140