divelab / DIG

A library for graph deep learning research
https://diveintographs.readthedocs.io/
GNU General Public License v3.0

ComENet: Confusion around optimal hyper-parameters and training procedure on Molecule3D #196

Closed fmocking closed 1 year ago

fmocking commented 1 year ago

Hello, great work on ComENet, thanks for sharing! The initial question is a duplicate of #155; however, I'm raising this issue for visibility. The selected hyper-parameters are not reported, which makes it almost impossible to reproduce the results without running the grid search multiple times. It would be helpful to at least share the logs so the community doesn't need to run multiple rounds of grid searches over multiple models.

Secondly, I believe that on the QM9 dataset the HOMO-LUMO gap is obtained by predicting HOMO and LUMO separately and taking the difference between the two predictions. It is not clear whether the same procedure is followed on Molecule3D. It would be great if this could be clarified.
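For concreteness, here is a minimal sketch of the gap computation I'm describing: predict HOMO and LUMO separately, then report the gap as their difference. The values below are dummy stand-ins, not outputs of the actual ComENet code.

```python
# Hypothetical per-molecule predictions from two separate targets/heads.
homo_pred = [-6.2, -5.9, -6.5]   # predicted HOMO energies (eV)
lumo_pred = [-1.1, -0.9, -1.3]   # predicted LUMO energies (eV)

# HOMO-LUMO gap = LUMO - HOMO, rounded to avoid float noise.
gap_pred = [round(l - h, 3) for h, l in zip(homo_pred, lumo_pred)]
print(gap_pred)  # [5.1, 5.0, 5.2]
```

The question is whether Molecule3D results use this two-prediction scheme or train a single model directly on the gap target.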

fmocking commented 1 year ago

@limei0307, I recommend publishing the trained models, ideally alongside their respective baselines. Your last update on September 17, 2022, indicated that this would be the plan. However, there have been no further updates, which is intriguing since the model runs should have been completed before the paper's submission. I hope you can provide clarification on this matter promptly. As the NeurIPS 23 submission deadline approaches, it is crucial for the research community to have access to the models from a paper published in NeurIPS 22.

limei0307 commented 1 year ago

Hi @fmocking, thanks for your reminder, and sorry for my late reply! Since the model architecture and general training code are available in our library, and the hyperparameters for all methods including the baselines are listed in the paper, I didn't spend much effort reorganizing the original code. But you are right that providing all of the code is very important for other researchers.

I will try to reorganize all the code, including all baseline methods, within about a week and share it somewhere.

Thanks again for your interest in our work and thanks for your patience!

fmocking commented 1 year ago

Hi @limei0307, thank you for your response. I appreciate your effort to reorganize the code and share it with the research community.

I have already patched some code to get the training pipeline working and managed to resolve a few issues on my end, such as the one mentioned in comment #194. Since the model was introduced with the Molecule3D dataset, I'm unsure how this issue was overlooked.

While gathering the training code is helpful, the main concern lies in reproducing the reported hyperparameters. The number of runs required to complete the hyperparameter search is on the order of thousands. Based on the time per epoch mentioned in the paper, the search would take years to finish even with 100 GPUs, which is not feasible for most researchers. I'm puzzled as to why you are hesitant to share the best hyperparameters found during your search. Since the results are presented in the paper, I assume these hyperparameters were used to train the models. Could you please reconsider sharing those values to facilitate the research community's efforts?
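To make the cost argument above concrete, here is the back-of-envelope arithmetic. Every number below is an illustrative assumption (grid size, epochs per run, time per epoch), not a value taken from the paper; the point is only that the product of these factors quickly reaches GPU-years.

```python
# Illustrative grid-search cost estimate -- all inputs are assumptions.
num_configs = 5000      # assumed number of hyperparameter combinations
epochs_per_run = 300    # assumed training epochs per run
hours_per_epoch = 5.0   # assumed wall-clock hours per epoch on one GPU
num_gpus = 100          # runs executed fully in parallel across GPUs

total_gpu_hours = num_configs * epochs_per_run * hours_per_epoch
wall_clock_years = total_gpu_hours / num_gpus / (24 * 365)
print(f"{wall_clock_years:.1f} years")  # prints "8.6 years" under these assumptions
```

Even shrinking any single factor by an order of magnitude still leaves a search that is out of reach for most labs, which is why sharing the selected values matters.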

limei0307 commented 1 year ago

Hi @fmocking, sure. Here are the tables (Tables 6-9 in our paper) with the exact values we used. In addition to the code, I will also try to upload the trained models within about a week. Thanks again for your patience!

fmocking commented 1 year ago

Hi @limei0307, thank you so much for providing the exact values used in your paper. Your effort to share these details is greatly appreciated, and I am sure it will benefit many researchers in the community.

In addition, I am looking forward to accessing the trained models once they are uploaded. I understand that organizing and uploading these resources requires a significant amount of work, and I truly appreciate your dedication to supporting the research community.

Thank you again for your hard work and timely responses. I wish you the best in your future research endeavors!

limei0307 commented 1 year ago

Hi @fmocking, thanks again for your interest in our work. Here (ComENet_Molecule3D_code_model_data) are the original code, trained models, and related files for all methods (GIN-Virtual, SchNet, DimeNet++, SphereNet, and ComENet) on the Molecule3D dataset. You can run the run.sh file in the code folder to make predictions using the trained models.

About the hyperparameters for all methods, please also see the run.sh file in the code folder.

About the Molecule3D dataset, please see this issue.

Thanks!