materialsproject / matbench

Matbench: Benchmarks for materials science property prediction
https://matbench.materialsproject.org
MIT License

new_benchmark #187

Closed PatReis closed 2 years ago

PatReis commented 2 years ago

Matbench Pull Request

Add MEGNet benchmark from "Graph Networks as a Universal Machine Learning Framework for Molecules and Crystals" by Chi Chen et al.

ardunn commented 2 years ago

@hrushikesh-s

ardunn commented 2 years ago

@PatReis thanks again for the submission! It is much appreciated! Also, I have to say the code implementation you provide with the kgcnn package is very clean and interpretable. Under the "algorithm_long" key, would you mind adding some more details about the training and/or hyperparameters (and how they were selected)? Even just "Default hyperparameters were used according to the original publication (n_layers, etc.)" would be enough. It just makes the entry easier to read on first inspection without having to dig through the reported hyperparameters in dict format.

ardunn commented 2 years ago

@hrushikesh-s when you review this PR, could you load the benchmark into an object, compare the numbers with what is reported in our original MEGNet publication, and see if there are any large discrepancies (and potentially investigate why)?

The way you would do this is as follows:

  1. Add @PatReis's fork as a git remote
  2. Create a new git branch and merge in the changes from the main branch of @PatReis's remote
  3. In a script, IPython, or Jupyter, load results.json.gz with MatbenchBenchmark.from_file(...)
  4. Look at the scores attribute for each task
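
Steps 3 and 4 could be sketched roughly as follows. This is a minimal sketch, not the reviewers' actual script: it assumes matbench is installed, that each task's scores attribute maps a metric name to fold statistics including a "mean" entry, and that tasks expose dataset_name. The helper names (load_mean_scores, find_discrepancies), the default metric, and the 10% tolerance are hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple


def load_mean_scores(path: str, metric: str = "mae") -> Dict[str, float]:
    """Return {task name: mean score over folds} for one metric.

    Assumption: task.scores looks like {"mae": {"mean": ..., ...}, ...}.
    Check this against the matbench version you have installed.
    """
    # Imported lazily so the comparison helper below works without matbench.
    from matbench.bench import MatbenchBenchmark

    mb = MatbenchBenchmark.from_file(path)
    means = {}
    for task in mb.tasks:
        scores = task.scores
        if metric in scores:  # e.g. classification tasks report other metrics
            means[task.dataset_name] = scores[metric]["mean"]
    return means


def find_discrepancies(
    submitted: Dict[str, float],
    reference: Dict[str, float],
    rel_tol: float = 0.10,
) -> List[Tuple[str, float, float, float]]:
    """List (task, submitted, reference, relative difference) above rel_tol.

    Tasks missing from `submitted`, or with a zero reference value,
    are skipped rather than flagged.
    """
    flagged = []
    for task, ref in sorted(reference.items()):
        if task not in submitted or ref == 0:
            continue
        rel = abs(submitted[task] - ref) / abs(ref)
        if rel > rel_tol:
            flagged.append((task, submitted[task], ref, rel))
    return flagged
```

To use it, call load_mean_scores on the path to the submission's results.json.gz, fill a reference dict by hand with the values reported in the publication, and pass both to find_discrepancies; any task it returns is a candidate for closer investigation.
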

PatReis commented 2 years ago

Sure, I will do.

PatReis commented 2 years ago

Is this better?

ardunn commented 2 years ago

> Is this better?

Yes, that looks good to me, but for clarification, what are the "QM runs"?

PatReis commented 2 years ago

Ah, sorry, I updated it. I just meant the runs trained on the QM9 (QM7) datasets, whose hyperparameters are usually given in the papers. That is, the hyperparameters with which the results on QM9 can be reproduced.

ardunn commented 2 years ago

> Ah, sorry, I updated it. I just meant from training on QM9 (QM7) datasets, which is usually given in the papers. So the hyperparameter with which the results on QM9 can be reproduced.

Oh great! Yeah, if you submit a new one with more optimized hyperparameters, that would be very interesting to compare to this. Though looking at the results here, they are very good and line up quite well with the original paper. It is still very interesting to me that MEGNet does so well on the phonon DOS problem.

I'll leave it to @hrushikesh-s to merge this in when he sees fit.