Akegarasu / lora-scripts

LoRA & Dreambooth training scripts & GUI using kohya-ss's trainer, for diffusion models.
GNU Affero General Public License v3.0

Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device. #504

Closed asizk closed 2 months ago

asizk commented 2 months ago

When loading the T5 model, the code at `scripts/flux_train_network.py`, line 238 raises an error at `text_encoders[1].to(accelerator.device, dtype=weight_dtype)`:

```
NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
```
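For context, this error comes from PyTorch's meta device: a module created there carries parameter shapes but no storage, so `.to()` has nothing to copy. A minimal standalone sketch reproducing the message and the `to_empty()` workaround the error suggests (this is not the trainer's code, just an illustration):

```python
import torch
import torch.nn as nn

# A module built on the meta device has parameter shapes but no data.
with torch.device("meta"):
    layer = nn.Linear(4, 4)

# .to() must copy data, so it fails for meta tensors.
try:
    layer.to("cpu")
except NotImplementedError as e:
    print(type(e).__name__)  # NotImplementedError

# .to_empty() allocates uninitialized storage on the target device instead;
# real weights still have to be loaded into the module afterwards.
layer = layer.to_empty(device="cpu")
print(layer.weight.device)  # cpu
```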

I am using the official FLUX.1-dev model (see attached image).

I have read issue #477, which says to use an FP16 model. I converted the T5 weights from bf16 to fp16 and saved a new model, but I still get the same error. This is my conversion code:

```python
import torch
from safetensors.torch import load_file, save_file

model_file = "model.safetensors"
model = load_file(model_file)

# Cast every tensor to fp16.
for key in model.keys():
    model[key] = model[key].to(torch.float16)

output_file = "fp16_model.safetensors"
save_file(model, output_file)
```

This is my training configuration file:

```toml
pretrained_model_name_or_path = "/FLUX.1-dev/flux1-dev.sft"
ae = "/FLUX.1-dev/ae.sft"
clip_l = "/FLUX.1-dev/text_encoder/model.safetensors"
t5xxl = "/FLUX.1-dev/text_encoder_2/model-00001-of-00002.safetensors"
timestep_sampling = "sigmoid"
sigmoid_scale = 1.0
model_prediction_type = "raw"
discrete_flow_shift = 1.0
loss_type = "l2"
guidance_scale = 1.0
train_data_dir = "/data/test"
prior_loss_weight = 1
resolution = "1280,1280"
enable_bucket = true
min_bucket_reso = 1152
max_bucket_reso = 1408
bucket_reso_steps = 64
bucket_no_upscale = true
output_name = "flux_v1"
output_dir = "/data/save"
save_model_as = "safetensors"
save_precision = "bf16"
save_every_n_epochs = 1
max_train_epochs = 100
train_batch_size = 2
gradient_checkpointing = true
gradient_accumulation_steps = 1
network_train_unet_only = true
network_train_text_encoder_only = false
learning_rate = 0.0001
unet_lr = 0.0005
text_encoder_lr = 1e-5
lr_scheduler = "cosine_with_restarts"
lr_warmup_steps = 0
lr_scheduler_num_cycles = 1
optimizer_type = "PagedAdamW8bit"
network_module = "networks.lora_flux"
network_dim = 256
network_alpha = 256
log_with = "tensorboard"
logging_dir = "./logs"
caption_extension = ".txt"
shuffle_caption = true
keep_tokens = 1
max_token_length = 255
seed = 1337
clip_skip = 2
mixed_precision = "bf16"
full_fp16 = false
full_bf16 = true
fp8_base = false
sdpa = true
lowram = false
cache_latents = true
cache_latents_to_disk = true
cache_text_encoder_outputs = false
cache_text_encoder_outputs_to_disk = false
persistent_data_loader_workers = true
vae_batch_size = 4
ddp_timeout = 88888
```