microsoft / MOFDiff

Coarse-grained Diffusion for Metal-Organic Framework Design

[QUESTION/ISSUE] Converting trained bb_encoder .ckpt to a .pt file for optimizing/sampling #24

Closed · Rebell-Leader closed this issue 1 week ago

Rebell-Leader commented 1 week ago

Good day! First of all, thanks for this interesting and promising publication! I've trained both the bb_encoder and the diffusion model on a new dataset (targeting a single optimization parameter), but I'm now struggling to sample new optimized structures with the bb_encoder checkpoint I got. The pretrained .pt file (`bb_emb_space.pt`) contains only two tensors, and they unpack normally as follows: `all_data, all_z = torch.load(args.bb_cache_path)`
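For reference, here is the minimal snippet I use to inspect that pretrained cache (the path is just wherever your copy of `bb_emb_space.pt` lives):

```python
import torch

# Path is an assumption -- point this at your copy of the pretrained cache.
all_data, all_z = torch.load("bb_emb_space.pt", map_location="cpu")

# Two objects come back: the building-block entries (all_data) and their
# latent embeddings (all_z). Printing them shows what a valid cache holds.
print(type(all_data), type(all_z))
```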

But after successful bb_encoder training, my output dir contains only these files: `epoch=35-val_loss=-28.56.ckpt`, `hparams.yaml`, `last.ckpt`, `train.log`, and `type_mapper.pt`. Both `last.ckpt` and `epoch=35-val_loss=-28.56.ckpt` are normal training checkpoint dicts, with `dict_keys(['epoch', 'global_step', 'pytorch-lightning_version', 'state_dict', 'loops', 'callbacks', 'optimizer_states', 'lr_schedulers', 'hparams_name', 'hyper_parameters'])`. How do I convert one of these into the required .pt structure with the two tensors? Did you use a specific script to stack the states into a tensor? Or did you just extract the encoder part of the bb model, `GemNetOCEncoder`, to a .pt file? But then where does the second tensor, `all_z`, come from?
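For completeness, this is how I've been peeking inside the Lightning checkpoints; the extraction step at the end is only a guess on my part, with the `encoder.` key prefix as an assumption to be checked against the printed names:

```python
import torch

# Hypothetical path -- use whichever checkpoint your run produced.
ckpt = torch.load("last.ckpt", map_location="cpu")

# A Lightning checkpoint is a plain pickled dict; the weights live under
# "state_dict", keyed by the LightningModule's attribute names.
print(ckpt.keys())
for name in list(ckpt["state_dict"])[:10]:
    print(name)

# Guesswork: pull out just the encoder weights. The "encoder." prefix is
# an assumption -- verify it against the key names printed above.
encoder_sd = {
    k.removeprefix("encoder."): v
    for k, v in ckpt["state_dict"].items()
    if k.startswith("encoder.")
}
torch.save(encoder_sd, "bb_encoder_weights.pt")
```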

kyonofx commented 1 week ago

Hi,

The equivalent `bb_emb_space.pt` file is saved when you train a MOF diffusion model with a given BB encoder:

https://github.com/microsoft/MOFDiff/blob/main/mofdiff/data/dataset.py#L475
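Conceptually, the cache is built by embedding every unique building block with the trained encoder and saving the pair of objects together. A minimal, self-contained sketch of that idea (the encoder and dataset below are dummy stand-ins; the actual implementation is at the line linked above):

```python
import torch
import torch.nn as nn

# Dummy stand-ins for the trained BB encoder and the unique building
# blocks collected while preparing the diffusion training data.
bb_encoder = nn.Linear(8, 4)                        # stand-in encoder
bb_dataset = [torch.randn(1, 8) for _ in range(5)]  # stand-in BB features

all_data, all_z = [], []
with torch.no_grad():
    for bb in bb_dataset:
        all_data.append(bb)           # keep the building-block record
        all_z.append(bb_encoder(bb))  # cache its latent embedding

all_z = torch.cat(all_z, dim=0)       # one embedding row per building block
torch.save((all_data, all_z), "bb_emb_space.pt")

# Reloading gives exactly the two-object structure of the pretrained file:
all_data, all_z = torch.load("bb_emb_space.pt")
```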

Rebell-Leader commented 1 week ago

Thank you so much, now everything runs perfectly!