bmaltais / kohya_ss

Apache License 2.0

Which flux model, UNet? or integrated model? #2743

Open carat-keeeehun opened 2 months ago

carat-keeeehun commented 2 months ago

When training a LoRA for the flux-dev model, I don't know which model file to use as the base model. I have two kinds of flux-dev model: the UNet-only model, saved in the ComfyUI/models/unet directory, and the integrated model, saved in the ComfyUI/models/checkpoints directory.

If I use the UNet model as the base model for flux-dev LoRA training, training runs without problems, but the generated image quality is poor. If I use the integrated model as the base model instead, I get the following error: NotImplementedError: Cannot copy out of meta tensor; no data!

Traceback (most recent call last):
  File "/home/comfyui/kohya_ss-flux/sd-scripts/flux_train_network.py", line 408, in <module>
    trainer.train(args)
  File "/home/comfyui/kohya_ss-flux/sd-scripts/train_network.py", line 389, in train
    self.cache_text_encoder_outputs_if_needed(args, accelerator, unet, vae, text_encoders, train_dataset_group, weight_dtype)
  File "/home/comfyui/kohya_ss-flux/sd-scripts/flux_train_network.py", line 158, in cache_text_encoder_outputs_if_needed
    unet.to("cpu")
  File "/home/comfyui/miniconda3/envs/kohya/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1160, in to
    return self._apply(convert)
  File "/home/comfyui/miniconda3/envs/kohya/lib/python3.10/site-packages/torch/nn/modules/module.py", line 810, in _apply
    module._apply(fn)
  File "/home/comfyui/miniconda3/envs/kohya/lib/python3.10/site-packages/torch/nn/modules/module.py", line 833, in _apply
    param_applied = fn(param)
  File "/home/comfyui/miniconda3/envs/kohya/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1158, in convert
    return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
NotImplementedError: Cannot copy out of meta tensor; no data!
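
A possible explanation, not confirmed in this thread: if the tensor names in the integrated file do not match what the loader expects, some parameters may never receive data and stay on the meta device, which would make `unet.to("cpu")` fail as shown above. The sketch below is one way to check which kind of file you have by reading only the safetensors header (an 8-byte little-endian length followed by a JSON header). The `model.diffusion_model.` prefix is an assumption about how integrated checkpoints name their UNet weights, and `read_safetensors_keys` and `classify_checkpoint` are hypothetical helper names, not part of kohya_ss or sd-scripts.

```python
import json
import struct

# ASSUMPTION: integrated (all-in-one) checkpoints store the UNet weights
# under this prefix, while UNet-only files use bare names.
INTEGRATED_PREFIX = "model.diffusion_model."


def read_safetensors_keys(path):
    """Return the tensor names in a .safetensors file without loading weights."""
    with open(path, "rb") as f:
        # The first 8 bytes are the JSON header length (unsigned 64-bit, little-endian).
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return [name for name in header if name != "__metadata__"]


def classify_checkpoint(keys):
    """Guess the file type from its tensor names (hypothetical heuristic)."""
    if any(name.startswith(INTEGRATED_PREFIX) for name in keys):
        return "integrated"
    return "unet-only"


# Usage (the path is a placeholder):
# keys = read_safetensors_keys("/path/to/flux-dev.safetensors")
# print(classify_checkpoint(keys), len(keys))
```

If the integrated file does use prefixed names, that would at least be consistent with the error, but it does not explain the image quality difference reported for the UNet-only file.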

Serkad commented 2 months ago

I have the same problem. Have you found a solution?

JimikoSK commented 2 months ago

I think either one works, honestly.

The trainer reports "train unet only: true" in the debug log, so it should train only the UNet contained in the model file, but I might be mistaken.