threestudio-project / threestudio

A unified framework for 3D content generation.
Apache License 2.0

Stable Diffusion XL #296

Open benquick123 opened 1 year ago

benquick123 commented 1 year ago

Hi, has anyone been experimenting with Stable Diffusion XL? I tried the trivial approach of changing the dreamfusion-sd config file to point to the SDXL weights, but I suspect that the way the weights are loaded has changed too, not just the weights themselves.

For example, I get the following warnings when loading SDXL in the stable_diffusion_guidance.py file:

The config attributes {'add_watermarker': None} were passed to StableDiffusionXLPipeline, but are not expected and will be ignored. Please verify your model_index.json configuration file.
Keyword arguments {'add_watermarker': None, 'safety_checker': None, 'feature_extractor': None, 'requires_safety_checker': False} are not expected by StableDiffusionXLPipeline and will be ignored.
The config attributes {'force_upcast': True} were passed to AutoencoderKL, but are not expected and will be ignored. Please verify your config.json configuration file.

And an error at the beginning of the iterations:

{...}
  File "/home/latent/threestudio/threestudio/models/guidance/stable_diffusion_guidance.py", line 415, in __call__
    grad, guidance_eval_utils = self.compute_grad_sds(
  File "/home/latent/threestudio/threestudio/models/guidance/stable_diffusion_guidance.py", line 242, in compute_grad_sds
    noise_pred = self.forward_unet(
  File "/home/latent/miniconda3/envs/threestudio/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 14, in decorate_autocast
    return func(*args, **kwargs)
  File "/home/latent/threestudio/threestudio/models/guidance/stable_diffusion_guidance.py", line 154, in forward_unet
    return self.unet(
  File "/home/latent/miniconda3/envs/threestudio/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/latent/miniconda3/envs/threestudio/lib/python3.10/site-packages/diffusers/models/unet_2d_condition.py", line 839, in forward
    if "text_embeds" not in added_cond_kwargs:
TypeError: argument of type 'NoneType' is not iterable

Any help will be greatly appreciated.
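
The TypeError at the end of the traceback is a membership test on `None`: `added_cond_kwargs` is `None`, presumably because the SD 1.5 guidance code never passes it to the UNet. A minimal, library-free sketch of that failure mode follows. `check_added_cond_kwargs` and `error_type` are hypothetical stand-ins, not diffusers functions, and the `time_ids` check is an assumption based on how the SDXL pipeline in diffusers conditions the UNet.

```python
def check_added_cond_kwargs(added_cond_kwargs):
    # Hypothetical stand-in mirroring the membership test shown in the traceback.
    if "text_embeds" not in added_cond_kwargs:
        raise ValueError("added_cond_kwargs needs 'text_embeds'")
    # Assumption: the SDXL UNet also expects 'time_ids' (size/crop conditioning).
    if "time_ids" not in added_cond_kwargs:
        raise ValueError("added_cond_kwargs needs 'time_ids'")
    return True


def error_type(added_cond_kwargs):
    # Return the name of the exception raised, or None if the check passes.
    try:
        check_added_cond_kwargs(added_cond_kwargs)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return None


# None fails the same way as in the traceback: `in` cannot search None.
print(error_type(None))
# A dict with only one of the two keys is still incomplete.
print(error_type({"text_embeds": "pooled"}))
# A dict with both keys passes the check.
print(error_type({"text_embeds": "pooled", "time_ids": [1024] * 6}))
```

So the guidance code would need to build and pass a populated `added_cond_kwargs` dict rather than leaving it at its default.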

mdarhdarz commented 1 year ago

Yes, I have run quite a few experiments with SDXL successfully. I may share my implementation if threestudio does not add SDXL support itself.

thuliu-yt16 commented 1 year ago

@mdarhdarz It would be great if you could share your implementation!

mdarhdarz commented 1 year ago

Some basic suggestions:

  1. Use the fp16-fixed VAE.
  2. Prepare GPUs with more than 32 GB of VRAM.
  3. The text encoders and text embeddings differ from SD 1.5, and the UNet forward pass needs more arguments. The pipeline implementation in diffusers is a good reference.
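
A rough sketch of point 3, assuming the extra UNet inputs are the ones used by the diffusers SDXL pipeline: a pooled text embedding (`text_embeds`) and six size/crop integers (`time_ids`), both passed through `added_cond_kwargs`. The helper names `build_time_ids` and `build_added_cond_kwargs` are hypothetical, the 1280-dimensional placeholder is an assumption, and in real guidance code both values would be tensors on the UNet's device and dtype rather than Python lists.

```python
def build_time_ids(original_size, crop_top_left, target_size):
    # Concatenate (height, width) of the original image, the (top, left)
    # crop offset, and the (height, width) of the target image.
    ids = list(original_size) + list(crop_top_left) + list(target_size)
    if len(ids) != 6:
        raise ValueError("expected three 2-element pairs")
    return ids


def build_added_cond_kwargs(pooled_text_embeds, time_ids):
    # The two keys the SDXL UNet looks up in added_cond_kwargs.
    return {"text_embeds": pooled_text_embeds, "time_ids": time_ids}


time_ids = build_time_ids((1024, 1024), (0, 0), (1024, 1024))
pooled = [0.0] * 1280  # placeholder for the pooled text embedding (assumed size)
added_cond_kwargs = build_added_cond_kwargs(pooled, time_ids)
print(added_cond_kwargs["time_ids"])
```

The resulting dict would then be passed to the UNet call alongside the usual latents, timestep, and per-token text embeddings.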

Even once SDXL runs, you may still not get a good 3D generation result. I will write this part down and find a way to make it public. BTW, my implementation is based on the December 2022 version of Stable-dreamfusion, and I have made a lot of modifications over the past year.

YG256Li commented 1 year ago

@mdarhdarz Great job! I also tried SDXL in threestudio, but I found that even with the FP16-Fixed-VAE, the 3D generation still diverges. I suspect that SDXL's VAE does not backpropagate gradients well. I don't know how you solved it.

lzqsd commented 1 year ago

@YG256Li Yeah, I am also facing the same issue: the 3D shape cannot converge after switching to SDXL. Moreover, when I use fp16 precision, the VAE sometimes outputs NaN values. Would you mind sharing how to fix these issues? Thanks a lot! @mdarhdarz
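
One plausible mechanism for the NaN outputs, stated here as an assumption rather than something confirmed in this thread, is fp16 overflow: values beyond the fp16 maximum of 65504 become inf, and operations such as inf - inf then yield NaN. A small standard-library sketch of that mechanism, plus a finite-value check of the kind that could be run on decoded outputs. `to_fp16` and `all_finite` are hypothetical helper names.

```python
import math
import struct


def to_fp16(x):
    # Round a Python float to half precision via struct's "e" format.
    # struct raises OverflowError when the value does not fit, so saturate
    # to +/-inf here to mimic what a tensor cast to fp16 would do.
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)


def all_finite(values):
    # True only if no value is inf or NaN.
    return all(math.isfinite(v) for v in values)


small = to_fp16(1000.0)          # representable in fp16
big_a = to_fp16(70000.0)         # exceeds the fp16 range
big_b = to_fp16(80000.0)         # exceeds the fp16 range
difference = big_a - big_b       # inf - inf

print(small, big_a, difference)
print(all_finite([small]), all_finite([small, difference]))
```

Running such a check on the VAE output at each step makes it easier to tell whether a diverging run started from non-finite values or from finite but poor gradients.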

mdarhdarz commented 1 year ago

> @YG256Li Yeah, I am also facing the same issue: the 3D shape cannot converge after switching to SDXL. Moreover, when I use fp16 precision, the VAE sometimes outputs NaN values. Would you mind sharing how to fix these issues? Thanks a lot! @mdarhdarz

I'm also waiting...

g-l-i-t-c-h-o-r-s-e commented 8 months ago

Hope to see more of this one day :D

mdarhdarz commented 8 months ago

> Hope to see more of this one day :D

@g-l-i-t-c-h-o-r-s-e Here: https://github.com/fudan-zvg/PGC-3D