huggingface / accelerate


fsdp.md - needs updating to accommodate diffusers models? #3089

Open · christopher-beckham opened this issue 1 month ago

christopher-beckham commented 1 month ago

Hi,

In the FSDP docs it says:

When using transformers save_pretrained, pass state_dict=accelerator.get_state_dict(model) to save the model state dict. Below is an example:

  unwrapped_model.save_pretrained(
      args.output_dir,
      is_main_process=accelerator.is_main_process,
      save_function=accelerator.save,
      state_dict=accelerator.get_state_dict(model),
  )

In diffusers (I can't speak for transformers), save_pretrained on anything that implements the ModelMixin class doesn't actually support passing in a custom state dict. While save_pretrained does take **kwargs, those are specifically kwargs to be passed on to push_to_hub:

https://github.com/huggingface/diffusers/blob/8cdcdd9e32925200ce5e1cf410fe14a774f3c3a6/src/diffusers/models/modeling_utils.py#L266-L275
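Concretely, as far as I can tell the extra kwarg is just silently swallowed, so the docs' pattern ends up saving the wrapped module's own parameters rather than the gathered FSDP state dict. A sketch of the failure mode (based on my reading of the linked code):

unwrapped_model = accelerator.unwrap_model(model)
unwrapped_model.save_pretrained(
    args.output_dir,
    is_main_process=accelerator.is_main_process,
    save_function=accelerator.save,
    state_dict=accelerator.get_state_dict(model),  # not a recognized argument; lands in **kwargs and is never used
)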

It is probably worth modifying fsdp.md to say that, in the case of diffusers, you might be better off doing something like:

import os
from safetensors.torch import save_file

model.save_config(save_dir)  # save_config is from ConfigMixin; writes config.json
# save_file wants a file path, not a directory:
save_file(accelerator.get_state_dict(model), os.path.join(save_dir, "diffusion_pytorch_model.safetensors"))

i.e., we have to be a bit hacky and save the state dict ourselves along with the config file. (There may be a more optimal solution, but I'm not a wizard at this.)
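For completeness, here is a slightly fuller sketch of that workaround that should be safe in the multi-process case. Assumptions: model was prepared by accelerator, get_state_dict is called on every rank so FSDP can gather the shards, and "diffusion_pytorch_model.safetensors" is the weights filename diffusers expects (worth double-checking against your diffusers version):

import os
from safetensors.torch import save_file

# Must run on all ranks: under FSDP this gathers the full, unsharded state dict.
state_dict = accelerator.get_state_dict(model)

if accelerator.is_main_process:
    os.makedirs(save_dir, exist_ok=True)
    unwrapped = accelerator.unwrap_model(model)
    unwrapped.save_config(save_dir)  # from ConfigMixin; writes config.json
    save_file(state_dict, os.path.join(save_dir, "diffusion_pytorch_model.safetensors"))

Loading it back with MyModelClass.from_pretrained(save_dir) should then work, since diffusers only needs config.json plus the safetensors file in that directory (MyModelClass here is a stand-in for whatever ModelMixin subclass was saved).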

Any thoughts? Thanks.

github-actions[bot] commented 2 days ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.