Closed: yapengyu closed this issue 3 weeks ago
The changes I made were for periodic validation during training, but your error seems to occur before training has even started, so perhaps @PromeAIpro, the original author, is needed here. Also, in my experience you need to double-check that the paths, types, etc. of the various models match.
@PromeAIpro Hi, would you mind taking a look at this bug? Thanks!
Can you show the detailed parameters you used?
@PromeAIpro The parameters are as follows:
accelerate launch train_controlnet_flux.py \
--pretrained_model_name_or_path="black-forest-labs/FLUX.1-dev" \
--dataset_name=fusing/fill50k \
--conditioning_image_column=conditioning_image \
--image_column=image \
--caption_column=text \
--output_dir=$OUTPUT_DIR \
--mixed_precision="fp16" \
--resolution=512 \
--learning_rate=1e-5 \
--max_train_steps=15000 \
--validation_steps=100 \
--checkpointing_steps=200 \
--validation_image "./conditioning_image_1.png" "./conditioning_image_2.png" \
--validation_prompt "red circle with blue background" "cyan circle with brown floral background" \
--train_batch_size=1 \
--gradient_accumulation_steps=4 \
--num_double_layers=4 \
--num_single_layers=0 \
--seed=42 \
--cache_dir=$CACHE_DIR \
--max_train_samples=100
It will take some time to discuss what is going on in this PR: https://github.com/huggingface/diffusers/pull/9711. I guess this is what causes your issue:
latent_image_ids = FluxControlNetPipeline._prepare_latent_image_ids(
    batch_size=pixel_latents_tmp.shape[0],
    height=pixel_latents_tmp.shape[2] // 2,  # the "// 2" here (and below) is likely what conflicts with the #9711 change
    width=pixel_latents_tmp.shape[3] // 2,
    device=pixel_values.device,
    dtype=pixel_values.dtype,
)
Checking out the train script at 0.31.0 may work and would be quickest.
Oh, thanks bro, commenting out the "// 2" in that snippet works! Thanks greatly!
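For anyone hitting the same thing, here is a minimal, self-contained sketch of that workaround: the same call as in the snippet above, but with the "// 2" dropped from the height and width arguments. The dummy tensors and shapes are purely illustrative, and whether you need this depends on which diffusers revision is installed (the behavior changed around PR #9711), so treat it as a sketch rather than a definitive fix.

```python
# Sketch of the workaround reported above: build the latent image ids without
# the extra "// 2" on height/width. Dummy tensors stand in for the values the
# training script computes; shapes are illustrative only (512x512 input).
import torch
from diffusers import FluxControlNetPipeline

pixel_values = torch.randn(1, 3, 512, 512)      # stand-in for the input images
pixel_latents_tmp = torch.randn(1, 16, 64, 64)  # stand-in for the VAE latents [B, C, H/8, W/8]

latent_image_ids = FluxControlNetPipeline._prepare_latent_image_ids(
    batch_size=pixel_latents_tmp.shape[0],
    height=pixel_latents_tmp.shape[2],  # was `pixel_latents_tmp.shape[2] // 2`
    width=pixel_latents_tmp.shape[3],   # was `pixel_latents_tmp.shape[3] // 2`
    device=pixel_values.device,
    dtype=pixel_values.dtype,
)

# Sanity-check: the number of ids should match the packed latent sequence length.
print(latent_image_ids.shape)
```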
Oh, you're right about checking out 0.31.0, thanks! At the v0.31.0 tag the bug is gone, while it still exists on the main branch.
It seems the main branch is still under development and the training scripts track the latest dev branch, so checking out 0.31.0 (which matches the release) also makes sense. See https://github.com/huggingface/diffusers/pull/9711.
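As a quick sanity check, you can compare the installed diffusers version with the branch the training script was copied from; running a main-branch script against a release install (or the other way around) is exactly the mismatch described here. The version string in the comment below is only an example.

```python
# Print the installed diffusers version. A ".dev0" suffix (for example
# something like "0.32.0.dev0") indicates an install from source/main,
# while a plain "0.31.0" matches the v0.31.0 training scripts.
import diffusers

print(diffusers.__version__)
```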
Describe the bug
Trying to train a Flux ControlNet, following 'train_controlnet_flux.py' and 'readme_flux.txt'.
Reproduction
Use the dataset 'fusing/fill50k' and the parameters mentioned in 'readme_flux.txt'.
Logs
System Info
Who can help?
@ScilenceForest @sayakpaul