instantX-research / InstantID

InstantID: Zero-shot Identity-Preserving Generation in Seconds 🔥
https://instantid.github.io/
Apache License 2.0

Out of memory issue with CUDA #47

vash-pika closed this issue 8 months ago

vash-pika commented 8 months ago

I tried to run the code snippet from the README and hit a CUDA out-of-memory error on an A100, which seems off:

import torch
from diffusers.models import ControlNetModel
from insightface.app import FaceAnalysis
from pipeline_stable_diffusion_xl_instantid import StableDiffusionXLInstantIDPipeline, draw_kps

face_adapter = './checkpoints/ip-adapter.bin'
controlnet_path = './checkpoints/ControlNetModel'

app = FaceAnalysis(name='antelopev2', providers=['CUDAExecutionProvider', 'CPUExecutionProvider'])
app.prepare(ctx_id=0, det_size=(640, 640))

controlnet = ControlNetModel.from_pretrained(controlnet_path, torch_dtype=torch.float16)
base_model = 'wangqixun/YamerMIX_v8'  # from https://civitai.com/models/84040?modelVersionId=196039
pipe = StableDiffusionXLInstantIDPipeline.from_pretrained(
    base_model, controlnet=controlnet, torch_dtype=torch.float16
)
pipe.cuda()
pipe.load_ip_adapter_instantid(face_adapter)

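# img (a PIL image) and img_cv (assumed to be the same image as a BGR NumPy array) must be loaded before this point
faces = app.get(img_cv)
# keep only the largest face: bbox is [x1, y1, x2, y2], so area = (x2 - x1) * (y2 - y1)
faces = sorted(faces, key=lambda x: (x['bbox'][2] - x['bbox'][0]) * (x['bbox'][3] - x['bbox'][1]))[-1]
face_emb = faces.embedding
face_kps = draw_kps(img, faces.kps)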

pipe.set_ip_adapter_scale(0.8)
negative_prompt = "(lowres, low quality, worst quality:1.2), (text:1.2), watermark, painting, drawing, illustration, glitch, deformed, mutated, cross-eyed, ugly, disfigured"

# generate image
with torch.no_grad():
    image = pipe(
        "film noir style",
        negative_prompt=negative_prompt,
        image_embeds=face_emb,
        image=face_kps,
        controlnet_conditioning_scale=0.8,
    ).images[0]

vash-pika commented 8 months ago

I think my input image resolution was too high, which makes face_kps (the keypoint image passed to the pipeline) correspondingly large.
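
One way to avoid this is to downscale the input image before calling app.get and draw_kps. The sketch below is not part of the README snippet: capped_size is a hypothetical helper, and the 1024-pixel cap and the rounding to multiples of 64 are assumptions (SDXL models are typically used around 1024x1024), not measured limits for any GPU. It only computes the target size; the resize itself is shown in comments.

```python
def capped_size(width, height, max_side=1024, multiple=64):
    """Return (w, h) with the longer side at most max_side, rounded down to `multiple`.

    Hypothetical helper: max_side=1024 and multiple=64 are assumed defaults,
    not values taken from the InstantID code or from this issue.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    if longest > max_side:
        # Integer arithmetic keeps the aspect ratio without float rounding surprises.
        width = width * max_side // longest
        height = height * max_side // longest
    # Round down to a multiple, but never below one multiple.
    new_w = max(multiple, width // multiple * multiple)
    new_h = max(multiple, height // multiple * multiple)
    return new_w, new_h


# Possible usage with Pillow (not executed here; `path` is a placeholder):
#   img = Image.open(path).convert("RGB")
#   img = img.resize(capped_size(*img.size))
# and then build img_cv from the resized img before calling app.get(img_cv).
print(capped_size(4000, 3000))
```

Because the resize happens before face detection, the face keypoints and the keypoint image are both computed at the smaller size.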

jatin9909 commented 6 months ago

Can you tell me what the image size limitations are? That is, up to what image size and dimensions could you run it without hitting the CUDA out-of-memory error?