ACM MM'23 (oral). SUR-adapter lets pre-trained diffusion models acquire the semantic understanding and reasoning capabilities of large language models, building a high-quality textual semantic representation for text-to-image generation.
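The inference code used is shown below. The weights are available at https://drive.google.com/drive/folders/1UyC9_AqTezmHXmj4dh0A-9RBKKx_JmJZ

    import os
    # Restrict the process to GPU 0; set this before CUDA is initialized.
    os.environ['CUDA_VISIBLE_DEVICES'] = '0'

    import torch
    from SUR_adapter import Adapter
    from SUR_adapter_pipeline import SURStableDiffusionPipeline

    # Load the adapter checkpoint and set the adapter weight.
    adapter_path = "adapter_checkpoint.pt"
    adapter = Adapter().to("cuda")
    adapter.load_state_dict(torch.load(adapter_path))
    adapter.adapter_weight = 0.1

    # Build the Stable Diffusion v1.5 pipeline with the adapter attached.
    model_path = "runwayml/stable-diffusion-v1-5"
    pipe = SURStableDiffusionPipeline.from_pretrained(model_path, adapter=adapter)
    pipe.to("cuda")
    # Replace the safety checker with a pass-through that returns images unchanged.
    pipe.safety_checker = lambda images, clip_input: (images, False)

    # Generate one image from the prompt and save it.
    image = pipe(prompt='An aristocratic maiden in medieval attire with a headdress of brilliant feathers').images[0]
    image.save("output.jpg")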