cyye001 / Con-CDVAE

MIT License

how to make train,test,val pt ?? #7

Open youngseoh opened 1 week ago

youngseoh commented 1 week ago

How do I create the train, test, and val .pt files? I obtained the MP20 dataset through an API, but I don't know how to convert it into .pt files.

cyye001 commented 1 week ago

When you use the dataset for the first time, the code reads the CSV file and saves the processed data as xxx.pt files in the same directory.
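The caching behaviour described above can be sketched roughly as follows. This is only an illustration of the "process once, reuse afterwards" pattern and not the repository's actual loader: `load_split` and `preprocess_row` are hypothetical names, the `material_id` and `cif` columns are assumed, and `pickle` stands in for whatever serializer the project really uses to write its .pt files.

```python
import csv
import pickle
import tempfile
from pathlib import Path


def preprocess_row(row):
    # Hypothetical stand-in for the real, expensive preprocessing
    # (for example parsing the CIF string and building a crystal graph).
    return {"material_id": row["material_id"], "num_chars": len(row["cif"])}


def load_split(csv_path):
    """Return processed records, caching them next to the CSV file."""
    csv_path = Path(csv_path)
    cache_path = csv_path.with_suffix(".pt")  # train.csv -> train.pt
    if cache_path.exists():
        # Later runs: skip preprocessing and read the cached file.
        with cache_path.open("rb") as f:
            return pickle.load(f)
    # First run: read the CSV, preprocess every row, then write the cache.
    with csv_path.open(newline="") as f:
        records = [preprocess_row(row) for row in csv.DictReader(f)]
    with cache_path.open("wb") as f:
        pickle.dump(records, f)
    return records


# Small demo with a throwaway CSV file (assumed column names).
demo_dir = Path(tempfile.mkdtemp())
demo_csv = demo_dir / "train.csv"
demo_csv.write_text("material_id,cif\nmp-1,data_example\n")
first = load_split(demo_csv)   # builds train.pt
second = load_split(demo_csv)  # reads train.pt
```

One consequence of this pattern is that a stale cache is reused silently, so after changing the CSV it is usually necessary to delete the old .pt files before rerunning.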

youngseoh commented 1 week ago

Thank you for your response; the issue has been resolved. However, when I run the model with the Materials Project data I downloaded, atom_type_probs keeps producing negative values, which disrupts training. If possible, could you share the CSV or .pt files you used for the train, test, and validation splits of the Materials Project data? Here’s my email: youngseoh6@gmail.com. I would greatly appreciate any help. Thank you very much!

cyye001 commented 1 week ago

This error may not be caused by the dataset. You could try using the dataset provided by Tian Xie—I initially used this dataset for testing as well. https://github.com/txie-93/cdvae/tree/main/data/mp_20

If possible, could you show me the detailed error message you’re encountering?

youngseoh commented 1 week ago

I also downloaded the train, test, and val datasets from this link and used them in my project. However, when I try to run the training, the following error occurs: (two screenshots of the error message, not preserved here)

cyye001 commented 1 week ago

In fact, what you’re encountering is likely not a negative value but rather a NaN.
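A quick way to tell the two cases apart is sketched below in plain Python. The `find_nan_indices` helper is a hypothetical name used only for illustration; for tensors, `torch.isnan` plays the same role as `math.isnan` here.

```python
import math

nan = float("nan")

# NaN fails every ordered comparison, so a "< 0" check does not catch it.
print(nan < 0)          # False
print(nan >= 0)         # False
print(math.isnan(nan))  # True


def find_nan_indices(values):
    """Return the indices of the entries that are NaN."""
    return [i for i, v in enumerate(values) if math.isnan(v)]


probs = [0.7, nan, 0.3]
bad = find_nan_indices(probs)
print(bad)  # [1]
```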

I think it is caused by exploding gradients. You can avoid this by enabling gradient clipping and reducing the learning rate.

To enable gradient clipping, pass it on the command line like this:

python concdvae/run.py train=new data=mptest expname=test model=vae_mp_format_gap train.PT_train.clip_grad_norm=0.001

To change the learning rate, edit conf/optim/default.yaml.
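For readers unfamiliar with clipping by norm, here is a minimal pure-Python sketch of the idea: if the global L2 norm of the gradients exceeds a threshold, every gradient is scaled down by the same factor so the direction is kept and only the step size shrinks. The `clip_grad_norm` function below is written for illustration and is not the project's code; in PyTorch the equivalent operation on model parameters is `torch.nn.utils.clip_grad_norm_`.

```python
import math


def clip_grad_norm(grads, max_norm, eps=1e-6):
    """Rescale grads in place so their global L2 norm is at most max_norm.

    Returns the norm measured before clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    # The scale is capped at 1.0 so small gradients are left untouched.
    scale = min(1.0, max_norm / (total_norm + eps))
    for i in range(len(grads)):
        grads[i] *= scale
    return total_norm


grads = [3.0, 4.0]  # global norm is 5.0
norm_before = clip_grad_norm(grads, max_norm=1.0)
norm_after = math.sqrt(sum(g * g for g in grads))

small = [0.1, 0.2]  # norm is below the threshold, so nothing changes
clip_grad_norm(small, max_norm=1.0)
```

A smaller threshold limits each update more aggressively, which can stabilize training at the cost of slower progress.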

By the way, another study (cond-CDVAE) found that adding a tanh function after the encoder can effectively mitigate this problem. For example, you can try adding 'energy = torch.tanh(energy)' before 'return energy' at line 426 of concdvae/pl_modules/gnn.py.
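The reason a tanh can help is that it maps any real input into the range from -1 to 1, so an occasional very large activation cannot propagate further. The toy values below are made up purely to show that bounding effect, using `math.tanh` in place of `torch.tanh`.

```python
import math

# Made-up raw outputs, including two extreme values.
raw_outputs = [-1000.0, -2.0, 0.0, 2.0, 1000.0]

# tanh squashes every value into the closed interval [-1, 1].
bounded = [math.tanh(x) for x in raw_outputs]

for raw, b in zip(raw_outputs, bounded):
    print(f"{raw:>8.1f} -> {b:+.4f}")
```

The trade-off is that very large inputs all saturate near -1 or 1, so differences in magnitude between them are lost.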