Open · solitaryTian opened this issue 11 months ago
We hit the same problem, bro.
Have you solved it?
No, but I continued training, since the ckpt had been saved.
Did your code stop abruptly without reporting an error?
It stopped with the same error, but I checked my output folder and found the ckpt had been saved, so I resumed from the latest one manually.
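A minimal sketch of that manual resume, assuming the diffusers-style checkpoint-&lt;step&gt; folder layout these training scripts write; `find_latest_checkpoint` is a hypothetical helper, not part of the scripts:

```python
import os
import re

def find_latest_checkpoint(output_dir: str):
    """Return the newest checkpoint-<step> folder in output_dir, or None."""
    ckpts = [d for d in os.listdir(output_dir) if re.fullmatch(r"checkpoint-\d+", d)]
    if not ckpts:
        return None
    latest = max(ckpts, key=lambda d: int(d.split("-")[1]))
    return os.path.join(output_dir, latest)

# e.g. feed this path to the training script's --resume_from_checkpoint argument
print(find_latest_checkpoint("./output"))
```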
Since an error is reported and training is forced to stop every time a checkpoint is saved, do you have to do that manual resume every time it stops? Will this affect the accuracy of the final trained model?
Yeah, I did perform that operation every 200 steps. But I can't answer your latter question, because I'm having trouble using the unet_lora I finally got. When I tried it with the pipeline given on the LCM Hugging Face page, I hit some new errors; maybe you'll see them too, bro:
The config attributes {'skip_prk_steps': True} were passed to LCMScheduler, but are not expected and will be ignored. Please verify your scheduler_config.json configuration file.
You have saved the LoRA weights using the old format. To convert the old LoRA weights to the new format, you can first load them in a dictionary and then create a new dictionary like the following: new_state_dict = {f'unet.{module_name}': params for module_name, params in old_state_dict.items()}
Traceback (most recent call last):
  File "/workspace/LCM/test_run_sdxl_lora.py", line 12, in <module>
I find that commenting out 'pipeline.enable_xformers_memory_efficient_attention()' in the log_validation method makes LoRA training work. However, the generated images are not as good as those from training with train_lcm_distill_sd_wds.py. Does your LoRA training work?
You are right. I can fix this error by disabling 'enable_xformers_memory_efficient_attention()'. However, when I train an LCM SDXL LoRA, disabling it leaves insufficient memory to run the log_validation function that tests the image quality of my trained LoRA on an A100. Are there any other solutions that don't require disabling 'enable_xformers_memory_efficient_attention()', or other ways to save memory?
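Not a definitive answer, but diffusers ships several memory reducers besides xformers that might let log_validation fit; a sketch using standard pipeline methods (the model id is only an example, and on PyTorch 2.x the default scaled-dot-product attention is already close to xformers in memory use):

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
# Alternatives to pipe.enable_xformers_memory_efficient_attention():
pipe.enable_attention_slicing()   # compute attention in slices: slower, less memory
pipe.enable_vae_slicing()         # decode validation images one at a time
pipe.enable_model_cpu_offload()   # keep only the active submodule on the GPU (needs accelerate)
```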
My LoRA training works. [image]
Is this photo from the SDXL LoRA? What inference code did you use?
No, it is from the LoRA of a fine-tuned version of SD1.5 (a private model of my company). It was just generated by the 'log_validation' function of the official code. Have you met the insufficient-memory error when training the LCM-LoRA SDXL?
Nope, I didn't hit insufficient memory; I trained it on a single A800. I manually resumed SDXL LoRA training every 200 steps, but the ckpt I got doesn't seem to match diffusers' standard format, so I ran into many bugs.
I have no idea about the memory issue. I use the standard train_lcm_distill_sd_lora_wds.py for SD1.5 (runwayml/stable-diffusion-v1-5) LoRA training with our data. But I find the generated images are blurred with guidance_scale=7, and it doesn't work at all with guidance_scale=1.0. What is your training config?
Maybe your training data has some problem. I tried laion-art and it was blurred too; after changing the dataset it works.
But with the same data on SD1.5, LCM works while LoRA-LCM does not....
For the ckpt that doesn't match diffusers' standard format, you can try it like this: https://github.com/luosiallen/latent-consistency-model/issues/57
Thanks, I will check it.
I trained an SDXL LoRA, but the results show I did something wrong. What do your blurred images look like?
It looks similar to what I got training the SD2.0-base LoRA. You can change the guidance_scale; I find guidance_scale=7 is better but still blurry. (PS: my training step count is very small, so I am not sure about longer runs.)
Yeah, my model is SDXL base 1.0 and the other settings are exactly the same as in the readme.md. So sad that the results are so blurred.
You are right.... I tried SD distillation and it works well, but when I tried LoRA, whether SD1.5 or SDXL, it doesn't work......
SD1.5 LoRA works well~ I find that once LoRA training reaches 5000 steps, it can generate pictures well. You can try longer training. This is different from SD distillation.
Could you please tell me what inference steps and guidance_scale you use?
For LoRA SD1.5 testing, we use guidance_scale=1; 1, 4, and 8 inference steps all work. For training, we tested different guidance_scale and lr values; it works well with longer training on our own dataset. (There are still some bad cases, but it does generate pictures.)
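For concreteness, a minimal sketch of those test settings using the standard diffusers LCM-LoRA inference recipe; the LoRA path is a placeholder for your own trained weights:

```python
import torch
from diffusers import StableDiffusionPipeline, LCMScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)  # LCM needs its own scheduler
pipe.load_lora_weights("path/to/your/lcm-lora")  # placeholder: your trained LoRA folder

# guidance_scale=1 disables classifier-free guidance, matching the test setup above.
image = pipe("a photo of an astronaut", num_inference_steps=4, guidance_scale=1.0).images[0]
image.save("lcm_lora_test.png")
```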
Yes, I tried guidance_scale values like 2, 3, 4 and it works well, but when I tried something like 8 it was very bad; the output photos contain many weird lines.
By the way, here I used the SDXL 1.0 base LoRA.
The XL LoRA on Hugging Face should work well with 8 steps. You mean the model you trained? I think the bad outputs may be caused by the dataset, too little training, or w_max=15 (try a smaller value). You can email dcfucheng@hotmail.com for discussion.
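On the w_max point: the LCM distillation scripts sample a guidance weight w uniformly between w_min and w_max for each batch and distill the teacher's guided output at that weight. The sketch below is a paraphrase of that sampling step, not the script verbatim; it also suggests why inference guidance far outside the trained range produces artifacts:

```python
import torch

w_min, w_max, bsz = 5.0, 15.0, 4   # w_max=15 is the value discussed above
# Draw one guidance weight per sample in the batch.
w = (w_max - w_min) * torch.rand((bsz,)) + w_min
w = w.reshape(bsz, 1, 1, 1)        # broadcast over the latent dimensions
print(w.flatten())
```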
Thanks, I have gotten good results with the SDXL base LoRA at steps=8, guidance_scale=2.0. I just wonder why, at guidance_scale=8, the output of the LCM LoRA is so bad, worse than other models~~ Anyway, thanks for your help.
Traceback (most recent call last):
  File "/dfs/comicai/songtao.tian/latent-consistency-model-main/LCM_Training_Script/consistency_distillation/./train_lcm_distill_lora_sd_wds.py", line 1378, in <module>
    main(args)
  File "/dfs/comicai/songtao.tian/latent-consistency-model-main/LCM_Training_Script/consistency_distillation/./train_lcm_distill_lora_sd_wds.py", line 1356, in main
    log_validation(vae, unet, args, accelerator, weight_dtype, global_step)
  File "/dfs/comicai/songtao.tian/latent-consistency-model-main/LCM_Training_Script/consistency_distillation/./train_lcm_distill_lora_sd_wds.py", line 331, in log_validation
    images = pipeline(
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py", line 918, in __call__
    noise_pred = self.unet(
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/diffusers/models/unet_2d_condition.py", line 1075, in forward
    sample, res_samples = downsample_block(
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/diffusers/models/unet_2d_blocks.py", line 1160, in forward
    hidden_states = attn(
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/diffusers/models/transformer_2d.py", line 392, in forward
    hidden_states = block(
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/diffusers/models/attention.py", line 323, in forward
    attn_output = self.attn2(
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/diffusers/models/attention_processor.py", line 522, in forward
    return self.processor(
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/diffusers/models/attention_processor.py", line 1142, in __call__
    hidden_states = xformers.ops.memory_efficient_attention(
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/xformers/ops/fmha/__init__.py", line 223, in memory_efficient_attention
    return _memory_efficient_attention(
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/xformers/ops/fmha/__init__.py", line 321, in _memory_efficient_attention
    return _memory_efficient_attention_forward(
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/xformers/ops/fmha/__init__.py", line 334, in _memory_efficient_attention_forward
    inp.validate_inputs()
  File "/root/miniconda3/envs/LCM/lib/python3.9/site-packages/xformers/ops/fmha/common.py", line 121, in validate_inputs
    raise ValueError(
ValueError: Query/Key/Value should either all have the same dtype, or (in the quantized case) Key/Value should have dtype torch.int32
query.dtype: torch.float32
key.dtype : torch.float16
value.dtype: torch.float16
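A closing note on this traceback: the query tensor is float32 while key/value are float16, i.e. the UNet used for validation and the text embeddings disagree in dtype. A minimal sketch of one workaround, assuming it is enough to make the whole validation pipeline dtype-consistent before enabling xformers (weight_dtype, unet, and accelerator are the names from log_validation above):

```python
import torch
from diffusers import StableDiffusionPipeline

weight_dtype = torch.float16  # assumption: the mixed-precision dtype used in training

# Build the validation pipeline with a single dtype everywhere.
pipeline = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=weight_dtype
).to("cuda")
# If you swap in the training UNet, as log_validation does, cast it to match:
# pipeline.unet = accelerator.unwrap_model(unet).to(dtype=weight_dtype)
pipeline.enable_xformers_memory_efficient_attention()
images = pipeline("a photo of a cat", num_inference_steps=4, guidance_scale=1.0).images
```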