XLabs-AI / x-flux

Apache License 2.0

TypeError: DoubleStreamBlockLoraProcessor.forward() missing 1 required positional argument: 'pe' #77

Closed · MidnightRambo closed this 2 months ago

MidnightRambo commented 3 months ago

I'm encountering an issue during training where the forward method in the DoubleStreamBlockLoraProcessor class throws a TypeError due to a missing argument. The error message is as follows:

TypeError: DoubleStreamBlockLoraProcessor.forward() missing 1 required positional argument: 'pe'

This occurs when the forward method is called without providing the pe argument, which is required for the function to execute correctly.
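As far as I can tell, the processor's forward() declares pe as a required positional parameter, but somewhere it gets invoked without it. A minimal illustration of the mismatch (the argument names are my guess, not the actual x-flux call site):

import torch.nn as nn

class DoubleStreamBlockLoraProcessor(nn.Module):
    # The real processor does LoRA-augmented attention; only the
    # signature matters here, and the argument names are assumptions.
    def forward(self, attn, img, txt, vec, pe):
        return img, txt

proc = DoubleStreamBlockLoraProcessor()
proc(attn=None, img=None, txt=None, vec=None)  # pe omitted
# TypeError: DoubleStreamBlockLoraProcessor.forward() missing 1 required
# positional argument: 'pe'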

I'm trying to run my training on RunPod on an A40 with 48 GB of VRAM.

My settings are as follows. YAML config for the LoRA training:

data_config:
  train_batch_size: 1
  num_workers: 4
  img_size: 512
  img_dir: image_datasets/images/
report_to: wandb
train_batch_size: 1
output_dir: lora/
max_train_steps: 10000
learning_rate: 1e-5
lr_scheduler: constant
lr_warmup_steps: 10
adam_beta1: 0.9
adam_beta2: 0.999
adam_weight_decay: 0.01
adam_epsilon: 1e-8
max_grad_norm: 1.0
logging_dir: logs
mixed_precision: "bf16"
checkpointing_steps: 1000
checkpoints_total_limit: 11
tracker_project_name: lora_test
resume_from_checkpoint: latest
gradient_accumulation_steps: 2
rank: 16
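If it matters, I believe the training script reads this file with OmegaConf, and a quick sanity check confirms the YAML parses and the nested keys resolve (the path is just where I saved it):

from omegaconf import OmegaConf

config = OmegaConf.load("train_configs/test_lora.yaml")  # path is my setup
print(config.data_config.img_size)  # 512
print(config.rank)                  # 16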

accelerate config:

debug: false                                                                                               
deepspeed_config:
  gradient_accumulation_steps: 4
  gradient_clipping: 1.0
  offload_optimizer_device: none
  offload_param_device: none
  zero3_init_flag: false
  zero_stage: 2
distributed_type: DEEPSPEED
downcast_bf16: 'no'
enable_cpu_affinity: false
machine_rank: 0
main_training_function: main
mixed_precision: bf16
num_machines: 1
num_processes: 1
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
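For completeness, I launch training with the command from the README (the config path is just where I saved the YAML above):

accelerate launch train_flux_lora_deepspeed.py --config train_configs/test_lora.yaml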

Does anyone have an idea how to solve this issue?

jiashenggu commented 2 months ago

Change https://github.com/XLabs-AI/x-flux/blob/9e1dd391b2316b1cfc20e523e2885fd30134a2e4/src/flux/model.py#L134 to:

for name, module in self.named_children():
    # Attach the LoRA processor only to the double-stream blocks.
    if name.startswith("double_blocks"):
        fn_recursive_attn_processor(name, module, processor)
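For context, that loop sits at the bottom of the model's set_attn_processor. A sketch of the whole method after the change (diffusers-style recursion; the surrounding code in model.py may differ slightly):

def set_attn_processor(self, processor):
    # Walk the module tree and attach `processor` to every child that
    # exposes set_processor().
    def fn_recursive_attn_processor(name, module, processor):
        if hasattr(module, "set_processor"):
            module.set_processor(processor)
        for sub_name, child in module.named_children():
            fn_recursive_attn_processor(f"{name}.{sub_name}", child, processor)

    # The fix: recurse only into double_blocks, so the LoRA processor
    # (whose forward() requires pe) never lands on blocks whose call
    # sites pass fewer arguments than its forward() expects.
    for name, module in self.named_children():
        if name.startswith("double_blocks"):
            fn_recursive_attn_processor(name, module, processor)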
stazizov commented 2 months ago

fixed, thank you