Open blacklong28 opened 5 days ago
@Victor49152 Hi, can you help me? I used the config provided in NeMo/examples/multimodal/text_to_image/stable_diffusion/conf and followed the tutorials. I also ran sd_xl_infer.py in the NeMo docker container, and the output image is still noise.
Thanks for your post. Could you please check the log from executing 'convert_hf_ckpt_to_nemo.py'? I think you might see some unexpected-keys and missing-keys warnings.
Some layer names might be changed in NeMo so the conversion script is not mapping the keys properly. Please let me know if that is the case, I will try to update the conversion script. Thanks.
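One way to confirm this kind of mismatch is to diff the parameter names the model expects against the names present in the converted checkpoint. The snippet below is a minimal sketch and not part of NeMo: `diff_state_dict_keys` is a hypothetical helper, and the key names in the usage lines are made up for illustration. In practice the two lists would come from the model's `state_dict().keys()` and from the keys of the loaded checkpoint.

```python
def diff_state_dict_keys(expected_keys, loaded_keys):
    """Return (missing, unexpected) key lists, sorted for stable output.

    missing:    keys the model expects but the checkpoint does not provide
    unexpected: keys the checkpoint provides but the model does not expect
    """
    expected = set(expected_keys)
    loaded = set(loaded_keys)
    missing = sorted(expected - loaded)
    unexpected = sorted(loaded - expected)
    return missing, unexpected


# Illustrative key names only (assumptions, not real NeMo or diffusers names).
expected = ["input_blocks.0.0.weight", "out.2.weight"]
loaded = ["conv_in.weight", "out.2.weight"]

missing, unexpected = diff_state_dict_keys(expected, loaded)
print("missing:", missing)
print("unexpected:", unexpected)
```

If both lists are non-empty and the entries look like renamed versions of each other, the key mapping in the conversion script is the likely culprit, since layers whose weights are never loaded keep their random initialization.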
Thanks for your reply. This is the log from executing 'convert_hf_ckpt_to_nemo.py': convert_nemo_test.log. I also saw some missing and unexpected keys in the SDXL Quantization.ipynb notebook you provided. I thought they were normal, so I didn't pay much attention to them. Please help me check whether they are normal. Thank you.
This conversion script is obsolete. Can you try https://github.com/NVIDIA/NeMo/blob/main/scripts/checkpoint_converters/convert_stablediffusion_hf_to_nemo.py for the conversion and https://github.com/NVIDIA/NeMo/blob/409f1d847ff53a66e56763da3a83e2980e9afe53/examples/multimodal/text_to_image/stable_diffusion/conf/sd_xl_infer_v2.yaml as the new inference config?
Let me know if these work for you. I will update the notebook later.
I used this script (https://github.com/NVIDIA/NeMo/blob/main/scripts/checkpoint_converters/convert_stablediffusion_hf_to_nemo.py) to convert a safetensors model to a NeMo checkpoint. I noticed that the script saves the model with torch.save, so the output is a .ckpt file, not a .nemo file. I then used this config:
model_cfg.unet_config.from_pretrained = "/opt/NeMo/nemo_out/sdxl_base_new_test1023A_nemo.ckpt"
model_cfg.unet_config.from_NeMo = True
model_cfg.first_stage_config.from_pretrained = "/opt/NeMo/nemo_out/sdxl_vae_new_test1023A_nemo.ckpt"
model_cfg.first_stage_config.from_NeMo = True
python3 /opt/NeMo/examples/multimodal/text_to_image/stable_diffusion/sd_xl_infer.py model.restore_from_path=/opt/NeMo/nemo_out/sdxl_base_new_test1023A_nemo.ckpt out_path=/opt/NeMo/infer_out
I got the error:
root@d392d2e1fa20:~# python3 /opt/NeMo/examples/multimodal/text_to_image/stable_diffusion/sd_xl_infer.py model.restore_from_path=/opt/NeMo/nemo_out/sdxl_base_new_test1023A_nemo.ckpt out_path=/opt/NeMo/infer_out
[NeMo I 2024-10-23 03:19:48 utils:285] FSDP is False, using DDP strategy.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
[NeMo W 2024-10-23 03:19:48 utils:333] Loading from .ckpt checkpoint for inference is experimental! It doesn't support models with model parallelism!
Error executing job with overrides: ['model.restore_from_path=/opt/NeMo/nemo_out/sdxl_base_new_test1023A_nemo.ckpt', 'out_path=/opt/NeMo/infer_out']
Traceback (most recent call last):
File "/opt/NeMo/examples/multimodal/text_to_image/stable_diffusion/sd_xl_infer.py", line 37, in main
trainer, megatron_diffusion_model = setup_trainer_and_model_for_inference(
File "/opt/NeMo/nemo/collections/multimodal/parts/utils.py", line 337, in setup_trainer_and_model_for_inference
model = model_provider.load_from_checkpoint(
File "/opt/NeMo/nemo/collections/nlp/models/nlp_model.py", line 385, in load_from_checkpoint
model = ptl_load_state(cls, checkpoint, strict=strict, cfg=cfg, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/core/saving.py", line 158, in _load_state
obj = cls(**_cls_kwargs)
File "/opt/NeMo/nemo/collections/multimodal/models/text_to_image/stable_diffusion/diffusion_engine.py", line 367, in __init__
super().__init__(cfg, trainer=trainer)
File "/opt/NeMo/nemo/collections/nlp/parts/mixins/nlp_adapter_mixins.py", line 88, in __init__
super().__init__(*args, **kwargs)
File "/opt/NeMo/nemo/collections/nlp/models/language_modeling/megatron_base_model.py", line 118, in __init__
with open_dict(cfg):
File "/usr/lib/python3.10/contextlib.py", line 135, in __enter__
return next(self.gen)
AttributeError: 'dict' object has no attribute '_get_node_flag'
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
Are there any other parameters or code I need to change here?
If I instead save the converted model with a .nemo suffix in the file name, sd_xl_infer.py reports an error when loading it:
root@d392d2e1fa20:~# python3 /opt/NeMo/scripts/checkpoint_converters/convert_stablediffusion_hf_to_nemo.py --input_name_or_path /sdxl_ckpts/stable-diffusion-xl-base-1.0/unet/ --output_path /opt/NeMo/nemo_out/sdxl_base_new_test1023A.nemo --model unet --debug
[NeMo I 2024-10-23 02:22:15 convert_stablediffusion_hf_to_nemo:413] loading checkpoint /sdxl_ckpts/stable-diffusion-xl-base-1.0/unet/
[NeMo I 2024-10-23 02:22:15 convert_stablediffusion_hf_to_nemo:418] converting unet...
[NeMo I 2024-10-23 02:22:15 convert_stablediffusion_hf_to_nemo:268] Add embedding found...
[NeMo I 2024-10-23 02:22:15 convert_stablediffusion_hf_to_nemo:273] Time embedding found...
[NeMo I 2024-10-23 02:23:16 convert_stablediffusion_hf_to_nemo:447] Saved nemo file to /opt/NeMo/nemo_out/sdxl_base_new_test1023A.nemo
root@d392d2e1fa20:~# python3 /opt/NeMo/examples/multimodal/text_to_image/stable_diffusion/sd_xl_infer.py model.restore_from_path=/opt/NeMo/nemo_out/sdxl_base_new_test1023A.nemo out_path=/opt/NeMo/infer_out
[NeMo I 2024-10-23 02:24:52 utils:285] FSDP is False, using DDP strategy.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
Error executing job with overrides: ['model.restore_from_path=/opt/NeMo/nemo_out/sdxl_base_new_test1023A.nemo', 'out_path=/opt/NeMo/infer_out']
Traceback (most recent call last):
File "/usr/lib/python3.10/tarfile.py", line 1870, in gzopen
t = cls.taropen(name, mode, fileobj, **kwargs)
File "/usr/lib/python3.10/tarfile.py", line 1847, in taropen
return cls(name, mode, fileobj, **kwargs)
File "/usr/lib/python3.10/tarfile.py", line 1707, in __init__
self.firstmember = self.next()
File "/usr/lib/python3.10/tarfile.py", line 2622, in next
raise e
File "/usr/lib/python3.10/tarfile.py", line 2595, in next
tarinfo = self.tarinfo.fromtarfile(self)
File "/usr/lib/python3.10/tarfile.py", line 1285, in fromtarfile
buf = tarfile.fileobj.read(BLOCKSIZE)
File "/usr/lib/python3.10/gzip.py", line 301, in read
return self._buffer.read(size)
File "/usr/lib/python3.10/_compression.py", line 68, in readinto
data = self.read(len(byte_view))
File "/usr/lib/python3.10/gzip.py", line 488, in read
if not self._read_gzip_header():
File "/usr/lib/python3.10/gzip.py", line 436, in _read_gzip_header
raise BadGzipFile('Not a gzipped file (%r)' % magic)
gzip.BadGzipFile: Not a gzipped file (b'PK')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/NeMo/examples/multimodal/text_to_image/stable_diffusion/sd_xl_infer.py", line 37, in main
trainer, megatron_diffusion_model = setup_trainer_and_model_for_inference(
File "/opt/NeMo/nemo/collections/multimodal/parts/utils.py", line 314, in setup_trainer_and_model_for_inference
model_cfg = model_provider.restore_from(
File "/opt/NeMo/nemo/collections/nlp/models/nlp_model.py", line 478, in restore_from
return super().restore_from(
File "/opt/NeMo/nemo/core/classes/modelPT.py", line 468, in restore_from
instance = cls._save_restore_connector.restore_from(
File "/opt/NeMo/nemo/collections/nlp/parts/nlp_overrides.py", line 1298, in restore_from
loaded_params = super().load_config_and_state_dict(
File "/opt/NeMo/nemo/core/connectors/save_restore_connector.py", line 148, in load_config_and_state_dict
members = self._filtered_tar_info(restore_path, filter_fn=filter_fn)
File "/opt/NeMo/nemo/core/connectors/save_restore_connector.py", line 622, in _filtered_tar_info
with SaveRestoreConnector._tar_open(tar_path) as tar:
File "/usr/lib/python3.10/contextlib.py", line 135, in __enter__
return next(self.gen)
File "/opt/NeMo/nemo/core/connectors/save_restore_connector.py", line 661, in _tar_open
tar = tarfile.open(path2file, tar_header)
File "/usr/lib/python3.10/tarfile.py", line 1817, in open
return func(name, filemode, fileobj, **kwargs)
File "/usr/lib/python3.10/tarfile.py", line 1874, in gzopen
raise ReadError("not a gzip file") from e
tarfile.ReadError: not a gzip file
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
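The `Not a gzipped file (b'PK')` message in the traceback above is informative: `PK` is the signature of a ZIP archive, which is what `torch.save` writes by default, while the loader here is opening the path with `tarfile`. In other words, renaming the output to `.nemo` does not change its container format. The snippet below is a minimal sketch using only the Python standard library; `detect_archive_kind` is a hypothetical helper (not a NeMo function), and the temporary files are stand-ins for real checkpoints.

```python
import os
import tarfile
import tempfile
import zipfile


def detect_archive_kind(path):
    """Classify a file as 'zip', 'tar', or 'unknown' from its contents, not its suffix."""
    with open(path, "rb") as f:
        magic = f.read(4)
    if magic == b"PK\x03\x04":  # local file header of a ZIP archive
        return "zip"
    if tarfile.is_tarfile(path):  # also recognizes gzip-compressed tar files
        return "tar"
    return "unknown"


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in payload; a real checkpoint would hold model weights.
    payload = os.path.join(tmp, "weights.bin")
    with open(payload, "wb") as f:
        f.write(b"not an archive")

    # A ZIP file that merely carries a .nemo suffix.
    zip_path = os.path.join(tmp, "zip_named_like.nemo")
    with zipfile.ZipFile(zip_path, "w") as zf:
        zf.write(payload, arcname="weights.bin")

    # A tar archive, the container format tarfile can open.
    tar_path = os.path.join(tmp, "real_tar.nemo")
    with tarfile.open(tar_path, "w:gz") as tf:
        tf.add(payload, arcname="weights.bin")

    zip_kind = detect_archive_kind(zip_path)
    tar_kind = detect_archive_kind(tar_path)
    raw_kind = detect_archive_kind(payload)

print(zip_kind, tar_kind, raw_kind)
```

Running a check like this on the converter's output would show whether the file passed to `model.restore_from_path` is a tar archive at all, independent of the name it was saved under.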
Describe the bug
I followed the tutorial to convert a model downloaded from https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/unet/diffusion_pytorch_model.safetensors and only converted the PyTorch model to a NeMo model, then ran inference with the NeMo model. The output image is full of noise. Is there a bug, or am I doing something wrong?
Steps/Code to reproduce bug
I just followed the tutorial. No quantized model is needed: the NeMo model is converted, and the sd_xl_infer.py script is used directly for inference. The quantized model produces the same result.
Download the model.
Convert the safetensors model to a NeMo model.
Run inference with the NeMo model.
Expected behavior
I expect inference to produce a normal image instead of pure noise.
Environment details
Additional context The output image: