SUDO-AI-3D / zero123plus

Code repository for Zero123++: a Single Image to Consistent Multi-view Diffusion Base Model.
Apache License 2.0

Increasing the image size in the diffuser pipeline causes the output image to collapse #50

Closed Fukuro99 closed 6 months ago

Fukuro99 commented 7 months ago

The default output size of the pipeline is 640x960, which makes each individual view coarse. We changed the code to output at 1024x1536 instead, but the resulting image collapsed.

[Image: output at 640x960]
[Image: output at 1024x1536]

Is there any way to increase the size of the output image?

Here is the code we used:

import torch
import requests
from PIL import Image
from diffusers import DiffusionPipeline, EulerAncestralDiscreteScheduler, ControlNetModel
import rembg

# Load the pipeline
pipeline = DiffusionPipeline.from_pretrained(
    "sudo-ai/zero123plus-v1.1", custom_pipeline="sudo-ai/zero123plus-pipeline",
    torch_dtype=torch.float16
)
pipeline.add_controlnet(ControlNetModel.from_pretrained(
    "sudo-ai/controlnet-zp11-depth-v1", torch_dtype=torch.float16
), conditioning_scale=0.75)
# Feel free to tune the scheduler
pipeline.scheduler = EulerAncestralDiscreteScheduler.from_config(
    pipeline.scheduler.config, timestep_spacing='trailing'
)
pipeline.to('cuda:0')
# Run the pipeline
cond = Image.open(requests.get("https://d.skis.ltd/nrp/sample-data/0_cond.png", stream=True).raw)
depth = Image.open(requests.get("https://d.skis.ltd/nrp/sample-data/0_depth.png", stream=True).raw)

cond = cond.resize((512, 512))
depth = depth.resize((1024, 1536))
result = pipeline(cond, width=1024, height=1536, depth_image=depth, num_inference_steps=28).images[0]
# result = pipeline(cond, depth_image=depth, num_inference_steps=28).images[0]
result.show()
result.save("output.png")
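As a side note on the default size: the 640x960 output appears to be a grid of six 320x320 views (2 columns by 3 rows). A minimal sketch of splitting it into individual views with PIL, assuming that tile layout (the `split_views` helper is ours, not part of the pipeline):

```python
from PIL import Image

def split_views(grid: Image.Image, tile: int = 320):
    """Split an assumed grid of square views into individual tiles."""
    cols = grid.width // tile   # expected 2 for a 640-wide output
    rows = grid.height // tile  # expected 3 for a 960-tall output
    views = []
    for r in range(rows):
        for c in range(cols):
            views.append(grid.crop((c * tile, r * tile,
                                    (c + 1) * tile, (r + 1) * tile)))
    return views

# Blank 640x960 canvas standing in for the pipeline output
views = split_views(Image.new("RGB", (640, 960)))
print(len(views), views[0].size)  # -> 6 (320, 320)
```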
eliphatfs commented 7 months ago

We didn't expect the current model to work at higher resolutions, due to the technique we use; we are developing a model that does, but that will take time.
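Until such a model is available, one possible workaround is to generate at the native 640x960 and upscale each view afterwards. A minimal sketch using plain Lanczos resampling (a learned super-resolution model such as Real-ESRGAN would recover more detail, but this keeps the example dependency-free):

```python
from PIL import Image

def upscale(view: Image.Image, factor: int = 2) -> Image.Image:
    """Upscale a single view with Lanczos resampling.

    This only interpolates existing pixels; it sharpens nothing,
    but avoids the collapse seen when sampling above the trained size.
    """
    return view.resize((view.width * factor, view.height * factor),
                       Image.LANCZOS)

# Blank 320x320 tile standing in for one extracted view
big = upscale(Image.new("RGB", (320, 320)), factor=3)
print(big.size)  # -> (960, 960)
```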