nupurkmr9 / concept-ablation

Ablating Concepts in Text-to-Image Diffusion Models (ICCV 2023)
https://www.cs.cmu.edu/~concept-ablation
MIT License

How to properly use #3

Open dimentox opened 1 year ago

dimentox commented 1 year ago

I am on Windows. I got around several issues, things like workers and such. I converted a safetensor to a diffusers model, did all the steps, and reran my inference... it looks the same.

What now? How do I check?

Basically, I'm trying to remove all kids, children, etc. from the model.

Metrics (logs_delta; values rounded to 4 decimals):

Prompt     KID      CLIP score   Baseline CLIP score   CLIP acc. vs "kids"   Baseline acc. vs "kids"
woman      0.0019   0.5762       0.5769                0.88                  0.88
adult      0.0021   0.5688       0.5728                0.82                  0.84
man        0.0035   0.5415       0.5546                0.52                  0.70
adults     0.0102   0.6117       0.6040                0.92                  0.92
kids       0.0092   0.6120       0.6162                0.00                  0.00
kids kid   0.0131   0.6054       0.6098                0.245                 0.58

How do I use the finished result? Did it modify the diffusers model, and do I need to convert it back into a checkpoint? Or do I do something with the delta?

More instructions would be good.

nupurkmr9 commented 1 year ago

Hi, thanks for your interest in our work. The following code snippet from the README can be used to run inference with the saved delta.bin file:

from model_pipeline import CustomDiffusionPipeline
import torch

# load the base Stable Diffusion v1.4 weights
pipe = CustomDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16).to("cuda")
# apply the saved ablation delta on top of the base weights
pipe.load_model('delta.bin')
# the ablated concept ("kid") should no longer appear in samples
image = pipe("painting of a kid", num_inference_steps=50, guidance_scale=6., eta=1.).images[0]

image.save("kid_removed.png")

Further, if you want to save the model as a default diffusers checkpoint instead of only the delta, you can do so by calling pipe.save_pretrained(all=True) after load_model in the above code snippet.
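As a minimal sketch of that step: the output directory name below is hypothetical, and passing it as the first argument assumes save_pretrained follows the usual diffusers convention of taking a save path.

from model_pipeline import CustomDiffusionPipeline
import torch

pipe = CustomDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16).to("cuda")
pipe.load_model('delta.bin')

# write the base weights merged with the delta as a full diffusers checkpoint;
# "sd14-kids-ablated" is a hypothetical output directory
pipe.save_pretrained("sd14-kids-ablated", all=True)

The saved directory should then be loadable like any other diffusers checkpoint.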

If kid_removed.png doesn't look like the kid has been removed, some things to try are below (a before/after comparison sketch follows the list):

  1. Train for more iterations.
  2. Also ablate other synonyms of kid, e.g., child, by providing --caption_target "person+kid;person+child", where person is the anchor concept that replaces kid.
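To answer the "how do I check?" question more directly, the comparison mentioned above generates samples from the same seed with and without the delta, so any change is attributable to the ablation. A minimal sketch, assuming the same CustomDiffusionPipeline API as in the snippet above; the prompt, seed, and file names are arbitrary:

from model_pipeline import CustomDiffusionPipeline
import torch

prompt = "painting of a kid"
pipe = CustomDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16).to("cuda")

# sample from the unmodified base model with a fixed seed
generator = torch.Generator("cuda").manual_seed(42)
pipe(prompt, num_inference_steps=50, guidance_scale=6., eta=1., generator=generator).images[0].save("kid_before.png")

# load the ablation delta and sample again from the same starting noise
pipe.load_model('delta.bin')
generator = torch.Generator("cuda").manual_seed(42)
pipe(prompt, num_inference_steps=50, guidance_scale=6., eta=1., generator=generator).images[0].save("kid_after.png")

If the two images come out nearly identical, the delta is probably not being applied, or training needs more iterations.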

I hope this helps. Let me know if you still face any issues.