How to use FSDP when some layers' weights are frozen?

Continuing from #258: when I freeze the weight and bias of output_proj in [ChromaStemConditioner](https://github.com/facebookresearch/audiocraft/blob/main/audiocraft/modules/conditioners.py#L509) and set fsdp.use = true, the following error is raised:

ValueError: FlatParameter requires uniform requires_grad

It seems that, by default, FSDP requires every parameter flattened into the same FlatParameter to have the same requires_grad value.

However, if use_orig_params=True is passed to the FSDP constructor, parameters with different requires_grad values should be allowed.

Yet in the original audiocraft/optim/fsdp.py, use_orig_params=True is already passed to _FSDPFixStateDict.

Why is it still not possible to freeze only some of the layers' weights, even though use_orig_params=True is passed to the constructor?
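To narrow down which parameters end up with different requires_grad values inside a module that gets wrapped, a small check like the one below may help. It is only a sketch, not part of audiocraft or PyTorch: the helper names are made up for illustration, other_layer is a hypothetical trainable layer, and the stand-in objects only mimic the requires_grad attribute so the snippet runs without PyTorch. In practice one would pass model.named_parameters() for the module being wrapped.

```python
from types import SimpleNamespace


def split_by_requires_grad(named_params):
    """Split (name, param) pairs into trainable and frozen parameter names."""
    trainable, frozen = [], []
    for name, param in named_params:
        (trainable if param.requires_grad else frozen).append(name)
    return trainable, frozen


def has_mixed_requires_grad(named_params):
    """True if the pairs contain both trainable and frozen parameters."""
    trainable, frozen = split_by_requires_grad(named_params)
    return bool(trainable) and bool(frozen)


# Stand-ins for real parameters (assumption: only requires_grad matters here).
# With PyTorch, use list(module.named_parameters()) instead.
fake_params = [
    ("output_proj.weight", SimpleNamespace(requires_grad=False)),
    ("output_proj.bias", SimpleNamespace(requires_grad=False)),
    ("other_layer.weight", SimpleNamespace(requires_grad=True)),  # hypothetical layer
]

trainable, frozen = split_by_requires_grad(fake_params)
print("trainable:", trainable)
print("frozen:", frozen)
print("mixed:", has_mixed_requires_grad(fake_params))
```

If the check reports a mix for a module that is flattened into a single FlatParameter, that is the situation the error message describes.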
