Do you know how to split a dec_48b.pth out of the checkpoint.pth produced by training my own model with 'hidden'?
Hi,
@mcl0 The code for the whitening layer is in https://github.com/facebookresearch/stable_signature/blob/main/finetune_ldm_decoder.py#L112. If the ckpt is not already whitened, that code creates a new whitened ckpt and saves it.
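For reference, what the linked code does is estimate whitening statistics from the extractor's outputs over a set of images and fold them into a single linear layer. A minimal sketch of that construction, assuming d-dimensional decoder outputs (`build_whitening_layer` is an illustrative helper, not the repo's API, and any extra output scaling used in the repo is omitted):

```python
import torch
import torch.nn as nn

@torch.no_grad()
def build_whitening_layer(features: torch.Tensor) -> nn.Linear:
    # features: (N, d) extractor outputs collected over a set of images
    d = features.shape[1]
    mean = features.mean(dim=0, keepdim=True)            # (1, d)
    centered = features - mean
    cov = centered.T @ centered / features.shape[0]      # (d, d) covariance
    eigvals, eigvecs = torch.linalg.eigh(cov)            # cov = V diag(e) V^T
    # W = diag(e^-1/2) V^T maps centered features to unit covariance
    whiten = torch.diag(eigvals.clamp(min=1e-8).rsqrt()) @ eigvecs.T
    layer = nn.Linear(d, d, bias=True)
    layer.weight.data = whiten
    layer.bias.data = -(mean @ whiten.T).squeeze(0)      # absorb the mean shift
    return layer
```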
@asdcaszc The ckpt that you obtain after training with the code of hidden also contains the optimizer states. I think you have to load the state dict and then take the "encoder_decoder" key:

```python
import torch

# The training checkpoint bundles the model weights and the optimizer states
ckpt = torch.load(ckpt_path, map_location="cpu")
# Keep only the model weights stored under the "encoder_decoder" key
ckpt = ckpt["encoder_decoder"]
torch.save(ckpt, "new_model.pth")
```
@pierrefdz Thanks a lot for your reply!😄 I'm sorry I missed that code earlier.
Hi, thank you for your amazing work! I'm still a little confused about how the whitened extractor checkpoint is obtained. From the appendix of the paper, I understand that the linear layer used for whitening should be appended after the last layer of the original extractor, but I'm at a loss as to how to implement that. I would really appreciate it if you could explain how to write a script that converts the provided dec_48b.pth into dec_48b_whit.torchscript.pt.
Thanks a lot!
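In case it helps others with the same question, a hedged end-to-end sketch of that conversion, reusing the `build_whitening_layer` helper sketched above (the `HiddenDecoder` constructor arguments and the data loader are assumptions, not confirmed in this thread):

```python
import torch
import torch.nn as nn

# Assumed: HiddenDecoder comes from the hidden/stable_signature code base,
# and `loader` iterates over the images used to estimate whitening statistics.
decoder = HiddenDecoder(num_blocks=8, num_bits=48, channels=64)  # args are guesses
decoder.load_state_dict(torch.load("dec_48b.pth", map_location="cpu"))
decoder.eval()

with torch.no_grad():
    feats = torch.cat([decoder(imgs) for imgs in loader], dim=0)  # (N, 48)

# Append the whitening layer after the extractor and export to torchscript
whitened = nn.Sequential(decoder, build_whitening_layer(feats))
scripted = torch.jit.script(whitened)  # torch.jit.trace is an alternative if scripting fails
torch.jit.save(scripted, "dec_48b_whit.torchscript.pt")
```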