microsoft / MOFDiff

Coarse-grained Diffusion for Metal-Organic Framework Design
MIT License

Issue of training the building block encoder #20

Open FrancisFengbin opened 4 months ago

FrancisFengbin commented 4 months ago

Hello, I got an error when running `python mofdiff/scripts/train.py --config-name=bb > ./testmilnew1.out`. The output is:

pytorch_lightning.utilities.exceptions.MisconfigurationException: ReduceLROnPlateau conditioned on metric val_loss which is not available. Available metrics are: ['train_num_atom_loss', 'train_num_atom_loss_step', 'train_num_cp_loss', 'train_num_cp_loss_step', 'train_diameter_loss', 'train_diameter_loss_step', 'train_id_loss', 'train_id_loss_step', 'train_loss', 'train_loss_step', 'z_norm', 'z_norm_step', 'train_num_atom_loss_epoch', 'train_num_cp_loss_epoch', 'train_diameter_loss_epoch', 'train_id_loss_epoch', 'train_loss_epoch', 'z_norm_epoch']. Condition can be set using `monitor` key in lr scheduler dict

It seems that ReduceLROnPlateau relies on a loss metric from the validation set, such as val_loss. But bb_encoder.py contains the comment `# building block embedding space learning does not involve validation or testing.`, so no validation metric is ever logged. How can I fix this? Thanks!

lkny123 commented 4 months ago

Hi, I encountered the same problem. In bb_encoder.py, try changing the return value of configure_optimizers as follows:

    return {
        "optimizer": opt,
        "lr_scheduler": {
            "scheduler": scheduler,
            "strict": False,
            "monitor": "val_loss",
        },
    }
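
For reference, here is a minimal sketch of the two ways this dict could be set up. The helper `scheduler_config` and the placeholder `opt` and `scheduler` objects are hypothetical, used only so the snippet runs without torch; in bb_encoder.py they would be the real optimizer and ReduceLROnPlateau instances. Monitoring `train_loss` is an assumption on my part and not something confirmed by the maintainers, although that metric does appear in the list of available metrics in the error message.

```python
from typing import Any, Dict


def scheduler_config(opt: Any, scheduler: Any, monitor: str = "val_loss",
                     strict: bool = False) -> Dict[str, Any]:
    """Build the dict returned by configure_optimizers (hypothetical helper)."""
    return {
        "optimizer": opt,
        "lr_scheduler": {
            "scheduler": scheduler,
            # Name of the logged metric that ReduceLROnPlateau is stepped on.
            "monitor": monitor,
            # With strict=False, a missing metric gives a warning, not an exception.
            "strict": strict,
        },
    }


# Placeholders so the sketch is self-contained (assumption: not the real objects).
opt, scheduler = object(), object()

# Option 1 (the fix above): keep val_loss, but do not fail when it is missing.
lenient = scheduler_config(opt, scheduler, monitor="val_loss", strict=False)

# Option 2 (assumption): monitor a metric that is logged during training.
on_train = scheduler_config(opt, scheduler, monitor="train_loss", strict=True)
```

Note that with the first option, as far as I understand, the scheduler is simply skipped whenever val_loss is missing, so the learning rate would likely stay constant. The second option keeps ReduceLROnPlateau active, but on the training loss instead of a validation loss.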

FrancisFengbin commented 4 months ago

@lkny123 Thanks! I'll give it a try.