Fix bugs & Add SDXL & Add more parameters for training

JeremyGuo commented 3 months ago

Add a resolution parameter to Basic & Advanced
Add SDXL to LoRA Training in Comfy
Correct the script path for train_network.py/sdxl
- Logs Dir: moved the logs dir into this project instead of the root folder of Comfy
- sd-scripts Dir: there is an sd-scripts folder in .git, which may cause some bugs.
Fix the bug where the sd-scripts library subpackage is not found (it was not found on Windows)
Add gitignore
Add an example workflow (image) in assets. I didn't change the README.md.

A further suggestion: models/lora probably lives under the root folder of ComfyUI, so it would be better to default to ../../models/lora, or to the first parent folder that contains models/lora.
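
A minimal sketch of the "first parent folder that contains models/lora" idea, written as an illustration and not taken from this PR. The function name find_lora_dir and its parameters are hypothetical:

```python
from pathlib import Path
from typing import Optional


def find_lora_dir(start: Path, relative: str = "models/lora") -> Optional[Path]:
    """Return the first `relative` directory found in `start` or any of its parents.

    Hypothetical helper: walks upward from `start` (for example the custom node
    folder) and stops at the first folder that contains `relative`.
    Returns None when no parent contains it, so the caller can fall back
    to another default.
    """
    start = Path(start).resolve()
    for folder in (start, *start.parents):
        candidate = folder / relative
        if candidate.is_dir():
            return candidate
    return None
```

Called with the custom node's own folder as start, this would return the models/lora folder of the enclosing ComfyUI install if one exists, independent of the current working directory.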

TESTED: Basic and Advanced training with SDXL. SD2.0 and SD1.5 are not tested, since I didn't change the relevant code (see the commit for details).

Good day.

Shanghai-Bill commented 2 months ago

good stuff, pulled your PR no issues.

tuner562 commented 2 months ago

I would like to pull this too, for SDXL.

rafaroeder commented 2 months ago

did not work :(

JeremyGuo commented 2 months ago

> did not work :(

Please provide more information on what aspect, and the corresponding log.

"did not work" won't help you or others.

rafaroeder commented 2 months ago

> > did not work :(
>
> Please provide more information on what aspect, and the corresponding log.
>
> "did not work" won't help you or others.

Hey JeremyGuo, yes, you are right! Sorry. I had been struggling for more than 8 hours straight. It worked after pulling this PR and setting the "library" path manually in "train_network.py". I'm a bit of a beginner. Thank you for your response, and for this PR!

shalevc1098 commented 1 month ago


Traceback (most recent call last):
  File "F:\AI\ComfyUI_windows_portable\ComfyUI\custom_nodes\Lora-Training-in-Comfy\sd-scripts\train_network.py", line 27, in <module>
    from library import model_util
ModuleNotFoundError: No module named 'library'
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "F:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\accelerate\commands\launch.py", line 996, in <module>
    main()
  File "F:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\accelerate\commands\launch.py", line 992, in main
    launch_command(args)
  File "F:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\accelerate\commands\launch.py", line 986, in launch_command
    simple_launcher(args)
  File "F:\AI\ComfyUI_windows_portable\python_embeded\Lib\site-packages\accelerate\commands\launch.py", line 628, in simple_launcher
    raise subprocess.CalledProcessError(returncode=process.returncode, cmd=cmd)
subprocess.CalledProcessError: Command '['F:\\AI\\ComfyUI_windows_portable\\python_embeded\\python.exe', 'F:/AI/ComfyUI_windows_portable/ComfyUI/custom_nodes/Lora-Training-in-Comfy/sd-scripts/train_network.py', '--enable_bucket', '--pretrained_model_name_or_path=F:\\AI\\ComfyUI_windows_portable\\ComfyUI\\models\\checkpoints\\meinamix_meinaV11.safetensors', '--train_data_dir=F:/AI/ComfyUI_windows_portable/data/Lucia_Nanami', '--output_dir=../../models/loras', '--logging_dir=./logs', '--log_prefix=lucia_nanami', '--resolution=512,512', '--network_module=networks.lora', '--max_train_epochs=10', '--learning_rate=1e-4', '--unet_lr=1e-4', '--text_encoder_lr=1e-5', '--lr_scheduler=cosine_with_restarts', '--lr_warmup_steps=0', '--lr_scheduler_num_cycles=1', '--network_dim=32', '--network_alpha=32', '--output_name=lucia_nanami', '--train_batch_size=1', '--save_every_n_epochs=10', '--mixed_precision=fp16', '--save_precision=fp16', '--seed=24', '--cache_latents', '--prior_loss_weight=1', '--max_token_length=225', '--caption_extension=.txt', '--save_model_as=safetensors', '--min_bucket_reso=256', '--max_bucket_reso=1584', '--keep_tokens=0', '--xformers', '--shuffle_caption', '--clip_skip=2', '--optimizer_type=AdamW8bit', '--persistent_data_loader_workers', '--log_with=tensorboard', '--clip_skip=2', '--optimizer_type=AdamW8bit', '--persistent_data_loader_workers', '--log_with=tensorboard']' returned non-zero exit status 1.
Train finished

JeremyGuo commented 1 month ago

Is line 13 of sd-scripts\train_network.py sys.path.append(os.path.dirname(__file__))? Can you show me the resulting path by adding a print on the next line?
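
For anyone hitting the same ModuleNotFoundError, here is a rough sketch of the kind of check meant above. It is an illustration only, not the code in this PR: the helper names are hypothetical, and it inserts at the front of sys.path instead of appending, on the assumption that this avoids picking up some other package that happens to be called library.

```python
import importlib.util
import os
import sys


def ensure_on_sys_path(directory: str) -> str:
    """Put `directory` at the front of sys.path (once) and return its absolute form."""
    directory = os.path.abspath(directory)
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory


def is_importable(package: str) -> bool:
    """Report whether a top-level package can be found on the current sys.path."""
    return importlib.util.find_spec(package) is not None
```

Near the top of train_network.py, before the line that does from library import model_util, one could call ensure_on_sys_path(os.path.dirname(os.path.abspath(__file__))) and then print both sys.path and is_importable("library") to see whether the sd-scripts folder is actually being searched.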

LarryJane491 / Lora-Training-in-Comfy

Fix bugs & Add SDXL & Add more parameters for training #52