threestudio-project / threestudio

A unified framework for 3D content generation.

weird performance on prolificdreamer #110

Closed Sheldonmao closed 1 year ago

Sheldonmao commented 1 year ago

Thank you for the great implementation! I am trying the demo code for prolificdreamer but getting weird results during training: the log shows loss_z_variance quickly exploding to NaN before 100 steps, and the evaluation renders are blank. I wonder what might cause the problem.

The script I am using:

python launch.py --config configs/prolificdreamer.yaml --train --gpu 0 system.prompt_processor.prompt="a pineapple" data.width=64 data.height=64

[tensorboard screenshot]

[evaluation at 50 steps: it50-0]

[evaluation at 100 steps: it100-0]

I am using the commit here. Due to connection problems, I had to manually download the pretrained weights for stable-diffusion-2-1-base and stable-diffusion-2-1 and upload them to my machine. Could the problem be caused by the pretrained weights not being assigned correctly?
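For reference, the model locations are set via the pretrained_model_name_or_path keys in the config. A hypothetical sketch of pointing them at manually downloaded local directories (paths are illustrative, assuming the diffusers-style loaders also accept local folders):

system:
  guidance:
    pretrained_model_name_or_path: "/path/to/stable-diffusion-2-1-base"        # illustrative local path
    pretrained_model_name_or_path_lora: "/path/to/stable-diffusion-2-1"        # illustrative local path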

DSaurus commented 1 year ago

Hi,

You can use the following config and code to test your diffusion model.

system:
  prompt_processor_type: "stable-diffusion-prompt-processor"
  prompt_processor:
    pretrained_model_name_or_path: "stabilityai/stable-diffusion-2-1-base"
    prompt: "A cute panda"
    front_threshold: 30.
    back_threshold: 30.

  guidance_type: "stable-diffusion-vsd-guidance"
  guidance:
    pretrained_model_name_or_path: "stabilityai/stable-diffusion-2-1-base"
    pretrained_model_name_or_path_lora: "stabilityai/stable-diffusion-2-1"
    guidance_scale: 7.5
    min_step_percent: 0.02
    max_step_percent: 0.98
    max_step_percent_annealed: 0.5
    anneal_start_step: 5000

if __name__ == '__main__':
    import os

    import cv2
    import numpy as np
    import torch

    import threestudio
    from threestudio.utils.config import load_config

    # Load the experimental Stable Diffusion test config.
    cfg = load_config("configs/experimental/stablediffusion.yaml")
    guidance = threestudio.find(cfg.system.guidance_type)(cfg.system.guidance)
    prompt_processor = threestudio.find(cfg.system.prompt_processor_type)(cfg.system.prompt_processor)
    prompt_utils = prompt_processor()

    # Placeholder conditioning tensors (camera parameters set to zero).
    temp = torch.zeros(1).to(guidance.device)
    rgb_image = guidance.sample(prompt_utils, temp, temp, temp)

    # Convert to uint8 BGR for OpenCV and save.
    rgb_image = (rgb_image[0].detach().cpu().clip(0, 1).numpy() * 255).astype(np.uint8)[:, :, ::-1].copy()
    os.makedirs('.threestudio_cache', exist_ok=True)
    cv2.imwrite('.threestudio_cache/diffusion_image.jpg', rgb_image)
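If the snippet is saved as a standalone script at the repository root (e.g. a hypothetical test_guidance.py), it can be run with python test_guidance.py; with correctly loaded weights it should write a recognizable image of the prompt to .threestudio_cache/diffusion_image.jpg.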

I can obtain the following image:

[diffusion_image]

bennyguo commented 1 year ago

Hi @Sheldonmao, loss_z_variance is not necessary for object generation (I think I already removed it in the latest version). You could set system.loss.lambda_z_variance=0 and try again.
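For example, using the launch command from the original report, the override can be appended on the command line:

python launch.py --config configs/prolificdreamer.yaml --train --gpu 0 system.prompt_processor.prompt="a pineapple" data.width=64 data.height=64 system.loss.lambda_z_variance=0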

Sheldonmao commented 1 year ago

> You can use the following config and code to test your diffusion model. [...]
>
> I can obtain the following image: [diffusion_image]

Thanks @DSaurus, I am getting a black image using the test code. I think this is indeed causing the bug. I will try updating the pre-trained models again and see if the problem can be resolved.

Sheldonmao commented 1 year ago

> I am getting a black image using the test code. I think this is indeed causing the bug. [...]

Thank you for your advice @bennyguo. I have checked the config file and it already sets lambda_z_variance=0, so I think my bug is indeed caused by the wrongly loaded diffusion models, as @DSaurus suggested.

Sheldonmao commented 1 year ago

Problem solved by downloading and configuring the correct diffusion models for stable-diffusion-2-1-base and stable-diffusion-2-1. The previous diffusion weights were probably not loaded correctly by the models. Thanks a lot to https://github.com/threestudio-project/threestudio/issues/110#issuecomment-1580488849.
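For anyone hitting the same issue, a minimal sketch of re-downloading both checkpoints with huggingface_hub (assuming Hub access; snapshot_download returns the local snapshot directory, which the config keys can then point to):

from huggingface_hub import snapshot_download

# Fetch complete local snapshots of both models used by the VSD guidance.
base_dir = snapshot_download(repo_id="stabilityai/stable-diffusion-2-1-base")
lora_dir = snapshot_download(repo_id="stabilityai/stable-diffusion-2-1")

# Point the config at these directories:
#   pretrained_model_name_or_path: <base_dir>
#   pretrained_model_name_or_path_lora: <lora_dir>
print(base_dir, lora_dir)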