hila-chefer / TargetCLIP

[ECCV 2022] Official PyTorch implementation of the paper Image-Based CLIP-Guided Essence Transfer.

Problem about finding direction #8

Closed. rainsoulsrx closed this issue 2 years ago.

rainsoulsrx commented 2 years ago

I ran this command to reproduce your results:

python3 optimization/find_dirs.py --target_path dirs/targets/avatar.jpg --dir_name results_folder_avatar --weight_decay 3e-3 --lambda_consistency 0.6 --step 1000 --lr 0.2 --num_directions 8 --num_images 8

But I got strange results; some of them are shown below. Did I do something wrong?

[image]

hila-chefer commented 2 years ago

Hi @rainsoulsrx, thanks for your interest in our work!

I’m guessing the image you attached here shows two of the directions you trained (0 and 4, right?).

For targets that involve a more extreme change to the semantic attributes, it is harder to obtain the direction you need without also changing the identity. As a result, some initializations work better than others: some of your resulting directions are good (such as direction 4 in your case), and some fail (such as direction 0). This is one of the reasons we train 8 directions for non-inverted targets.

In general, when you have a difficult target and no inversion of it (initializing the direction with an inversion works better), my suggestion is to increase the influence of the CLIP similarity score using the lambda_transfer argument to find_dirs. In your case, I’d try something like:

python3 optimization/find_dirs.py --target_path dirs/targets/avatar.jpg --dir_name results_folder_avatar --weight_decay 3e-3 --lambda_consistency 0.6 --step 1000 --lr 0.2 --num_directions 8 --num_images 8 --lambda_transfer 1.5

That said, even without increasing the influence of the similarity score, you got at least one good direction to use 👍
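To make the effect of the weight more concrete, here is a toy sketch. It is not the TargetCLIP loss: it only assumes that the objective is a weighted sum of a "transfer" term and a "consistency" term, and uses a made-up one-dimensional stand-in for each. The function name toy_optimum and both quadratic terms are hypothetical.

```python
# Toy illustration only, NOT the TargetCLIP objective.
# Assumption: loss(x) = lambda_transfer * (x - 1)**2 + lambda_consistency * x**2
#   x = 1 stands for "fully matches the target essence",
#   x = 0 stands for "source left unchanged".


def toy_optimum(lambda_transfer: float, lambda_consistency: float) -> float:
    """Return the x that minimizes the toy weighted loss.

    Setting the derivative 2*lt*(x - 1) + 2*lc*x to zero gives
    x = lt / (lt + lc).
    """
    return lambda_transfer / (lambda_transfer + lambda_consistency)


if __name__ == "__main__":
    # 0.6 matches the --lambda_consistency value used in the command above;
    # the lambda_transfer values other than 1.5 are arbitrary examples.
    for lt in (1.0, 1.5, 3.0):
        x = toy_optimum(lt, 0.6)
        print(f"lambda_transfer={lt}: toy optimum x = {x:.3f}")
```

In this toy setting, raising lambda_transfer while keeping lambda_consistency fixed moves the optimum closer to the target, which is the intuition behind the suggestion above. The real optimization is high-dimensional and non-convex, so the actual behavior may differ.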

We are working on improving the code to avoid these cases, so stay tuned :)

Best, Hila.

rainsoulsrx commented 2 years ago

Hi, thanks for your quick reply. Do you mean that I can get the inversion of the target with e4e (something like me.pt), pass it as --target_path dirs/targets/me.pt, and the results will be better?

hila-chefer commented 2 years ago

Happy to help :) I’m saying that initializing the direction with the target’s inversion is usually helpful. Increasing lambda_transfer is another way to help with difficult targets, and it should work too. Correct me if I’m wrong, but direction 4 is good in your case, right?

If you wish to experiment with inverted targets, see this section of our readme for detailed instructions. You would indeed need to invert the target with e4e for that, but as I mentioned, it’s not obligatory: you already have a successful direction, and you can always increase lambda_transfer to get better results.
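If you want to try several lambda_transfer values, a small helper along these lines could print the commands to run. This is only a sketch: the flags are the ones from the command earlier in this thread, while the values 2.0 and 2.5, the per-run folder suffix, and the helper name build_command are assumptions for illustration. It prints the commands rather than executing them.

```python
import shlex
from typing import List

# Arguments copied from the command discussed in this thread.
BASE_ARGS = [
    "python3", "optimization/find_dirs.py",
    "--target_path", "dirs/targets/avatar.jpg",
    "--weight_decay", "3e-3",
    "--lambda_consistency", "0.6",
    "--step", "1000",
    "--lr", "0.2",
    "--num_directions", "8",
    "--num_images", "8",
]


def build_command(lambda_transfer: float) -> List[str]:
    """Build one find_dirs.py invocation for a given lambda_transfer."""
    # Hypothetical naming scheme: one results folder per value,
    # so that runs do not overwrite each other.
    dir_name = f"results_folder_avatar_lt{lambda_transfer}"
    return BASE_ARGS + [
        "--dir_name", dir_name,
        "--lambda_transfer", str(lambda_transfer),
    ]


if __name__ == "__main__":
    # 1.5 is the value suggested above; 2.0 and 2.5 are arbitrary examples.
    for lt in (1.5, 2.0, 2.5):
        print(shlex.join(build_command(lt)))
```

Each printed line can be pasted into a shell from the repository root; comparing the resulting directions side by side shows which weight works best for a given target.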

I hope this helps.

rainsoulsrx commented 2 years ago

Yeah, I got it. Direction 4 is better than the other directions. Thanks for your help~~