Describe the bug
When I load the checkpoint of the transformer saved by the training script train_dreambooth_flux.py, I find it to be exactly the same as the pretrained FLUX.1-dev model, so I suspect the parameters are not being updated during training. I also notice that the optimizer.bin in the checkpoint directory is very small (only 1.3K), which seems abnormal. Saving and reloading a checkpoint works correctly with train_dreambooth_sd3.py, but fails with train_dreambooth_flux.py.
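As a quick sanity check on the tiny optimizer file, its contents can be inspected directly. This is a minimal sketch, assuming accelerator.save_state() wrote the optimizer state dict to optimizer.bin with torch.save; the checkpoint path is the same placeholder used in the reproduction script below:

import torch

# Placeholder checkpoint path, same as in the reproduction script below.
optimizer_state = torch.load('/xxx/checkpoint-2/optimizer.bin', map_location='cpu')

# A healthy optimizer checkpoint should have one entry in `state` per trainable
# parameter; an (almost) empty dict would be consistent with the 1.3K file size.
print('param groups:', len(optimizer_state['param_groups']))
print('params with state:', len(optimizer_state['state']))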
Reproduction
A test script to reproduce the issue:
import torch
from diffusers import (
    AutoencoderKL,
    FlowMatchEulerDiscreteScheduler,
    FluxPipeline,
    FluxTransformer2DModel,
)

# Load the pretrained transformer and keep a copy of its initial parameters.
transformer1 = FluxTransformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev", subfolder="transformer", torch_dtype=torch.bfloat16
)
transformer1.eval()
initial_params = {name: param.data.clone() for name, param in transformer1.named_parameters()}

# The folder that contains the checkpoint of the transformer saved with accelerator.save_state()
transformer_path = '/xxx/checkpoint-2/transformer'
transformer2 = FluxTransformer2DModel.from_pretrained(
    transformer_path, torch_dtype=torch.bfloat16,
)

# Print every parameter of the saved checkpoint that differs from the pretrained weights.
for name, param in transformer2.named_parameters():
    if not torch.equal(initial_params[name], param.data):
        print(name, ' not match')
Logs
Running the test script above prints no mismatches, i.e. the saved transformer is exactly the same as the pretrained transformer.
System Info
Who can help?
@sayakpaul