Closed satwiksunnam19 closed 1 year ago
Hey! Thanks.
As long as you don't exceed the GPU's memory limit, you can do that by supplying a list of prompts through the interface.
Here is an example I ran with the canny edge setup from the supplied Colab notebook:
```python
import torch
import matplotlib.pyplot as plt

pipe.to('cuda')
text_prompt = ["squirrel", "elephant"]

# generate one image per prompt (shared conditioning inputs)
generator = torch.manual_seed(0)
new_image = pipe(
    text_prompt,
    num_inference_steps=20,
    generator=generator,
    image=image,
    control_image=canny_image,
    controlnet_conditioning_scale=0.5,
    mask_image=mask_image
).images

plt.subplot(1, 2, 1)
plt.imshow(new_image[0])
plt.subplot(1, 2, 2)
plt.imshow(new_image[1])
```
with the following result:
I don't know if this was your question or whether you were asking about a more efficient way to compute this. That said, it is unlikely I will be making further changes in this respect, as I want this pipeline to resemble the related diffusers
pipelines (inpainting and controlnet) as much as possible. Open to discussion, though!
Does this work with a standard GPU, such as a T4?
Yes
Also, in case you're looking for an example with multiple images and multiple prompts, you can actually do that by supplying:
1. Inputs

```python
import torch
from torchvision.transforms import ToTensor

# convert each PIL image to a (1, C, H, W) tensor
img1 = ToTensor()(image).unsqueeze_(0)
mask1 = ToTensor()(mask_image).unsqueeze_(0)
canny1 = ToTensor()(canny_image).unsqueeze_(0)

# second sample: the first one flipped along the width axis
img2 = torch.flip(img1, [-1])
mask2 = torch.flip(mask1, [-1])
canny2 = torch.flip(canny1, [-1])

img_stack = 2 * torch.cat([img1, img2], 0) - 1  # convert to [-1,+1] range
mask_stack = torch.cat([mask1, mask2], 0)[:, 0, :, :]
canny_stack = torch.cat([canny1, canny2], 0)
```
>💡 In this example, the second image is the first image **horizontally flipped**
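To sanity-check the stacked tensors before calling the pipeline, you can verify their shapes and value ranges with plain torch. This is a sketch with random tensors standing in for the actual `ToTensor()` outputs; the 3×64×64 size is an arbitrary assumption.

```python
import torch

# stand-ins for ToTensor()(image).unsqueeze_(0): shape (1, C, H, W), values in [0, 1]
img1 = torch.rand(1, 3, 64, 64)
mask1 = torch.rand(1, 3, 64, 64)

# second sample: the first one flipped horizontally (last dim is width)
img2 = torch.flip(img1, [-1])
mask2 = torch.flip(mask1, [-1])

img_stack = 2 * torch.cat([img1, img2], 0) - 1      # images rescaled to [-1, +1]
mask_stack = torch.cat([mask1, mask2], 0)[:, 0, :, :]  # masks keep one channel

print(img_stack.shape)   # torch.Size([2, 3, 64, 64])
print(mask_stack.shape)  # torch.Size([2, 64, 64])
```

Flipping twice recovers the original, so `torch.flip(img2, [-1])` equals `img1` exactly.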
2. Pipeline
```python
text_prompt = ["squirrel", "elephant"]

# generate one image per (prompt, conditioning) pair
generator = torch.manual_seed(0)
new_image = pipe(
    text_prompt,
    num_inference_steps=20,
    generator=generator,
    image=img_stack,
    control_image=canny_stack,
    controlnet_conditioning_scale=0.5,
    mask_image=mask_stack
).images

for idx, img in enumerate(new_image):
    plt.subplot(1, len(new_image), 1 + idx)
    plt.imshow(img)
```
Hello @mikonvergence, your work is awesome, and I have a query about an issue that has been on my mind for days.
I have 10-15 different prompts that I want to run on a single image, but on a T4 GPU the memory runs out (fragments) even for a single image with a single prompt.
Thanks and regards, Satwik Sunnam.
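One common way to keep peak GPU memory low when running many prompts (a sketch, not something from this thread) is to split the prompt list into small chunks and call the pipeline once per chunk, collecting the outputs. Here `run_chunk` is a hypothetical stand-in for the actual `pipe(...)` call:

```python
def chunked(items, size):
    """Yield successive chunks of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def run_prompts(prompts, run_chunk, batch_size=4):
    """Run prompts a few at a time so only `batch_size` images are in GPU memory at once.

    `run_chunk` is a hypothetical callable standing in for something like
    lambda p: pipe(p, image=image, control_image=canny_image, mask_image=mask_image).images
    """
    results = []
    for chunk in chunked(prompts, batch_size):
        results.extend(run_chunk(chunk))
    return results

# demo with a dummy stand-in for the pipeline call
prompts = [f"prompt {i}" for i in range(15)]
outputs = run_prompts(prompts, run_chunk=lambda c: [p.upper() for p in c])
print(len(outputs))  # 15
```

With `batch_size=1` this degenerates to one prompt per call, which is the lowest-memory (and slowest) option.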