We now manually copy the infshape onto the module via `mu_readout.weight_infshape = mu_readout.weight.infshape` after setting base shapes. This way, we can still access the infshape after FSDP-wrapping. Because this also requires using `FSDP(..., use_orig_params=True)`, the README is adjusted accordingly to mention this caveat.
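To illustrate the idea, here is a minimal pure-Python sketch. It assumes the problem is that wrapping swaps the parameter for a new object, so attributes attached to the old one are no longer reachable; that failure mode is an assumption for illustration, not something spelled out in this PR. `FakeParam`, `FakeReadout`, `fake_set_base_shapes`, and `fake_wrap` are hypothetical stand-ins, not mup or PyTorch APIs, and the `use_orig_params=True` requirement is not modelled here.

```python
# Stand-in classes only: none of these are mup or PyTorch APIs.

class FakeParam:
    """Hypothetical stand-in for a parameter that can carry extra attributes."""
    def __init__(self, data):
        self.data = list(data)


class FakeReadout:
    """Hypothetical stand-in for a readout layer that owns one weight."""
    def __init__(self, data):
        self.weight = FakeParam(data)


def fake_set_base_shapes(module, infshape):
    """Stand-in for setting base shapes: tag the weight, then mirror the tag."""
    module.weight.infshape = infshape
    # The copy described above: keep a second reference on the module itself.
    module.weight_infshape = module.weight.infshape


def fake_wrap(module):
    """Stand-in for a wrapper that replaces the parameter with a fresh object."""
    # Attributes set on the old parameter object do not carry over.
    module.weight = FakeParam(module.weight.data)
    return module


readout = FakeReadout([0.1, 0.2])
fake_set_base_shapes(readout, infshape="placeholder-infshape")
wrapped = fake_wrap(readout)

param_has_infshape = hasattr(wrapped.weight, "infshape")
module_infshape = wrapped.weight_infshape
print(param_has_infshape, module_infshape)
```

The attribute stored on the module survives the swap, while the one stored on the parameter does not, which is why the copy lives on `mu_readout` rather than on its weight.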
Fix #59. @edwardjhu is the review offer still up? :)