Closed: hongshuh closed this issue 11 months ago.
You probably don't actually need to compute the model-predicted hull distances. The formation energy fully determines the hull distance so you can just compute your model's formation energy MAE during hyperparam tuning. That will be identical to the model's hull distance MAE.
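The reason the two MAEs coincide: for a fixed composition the convex hull energy is a constant, so `hull_dist = e_form - e_hull(composition)` shifts predictions and targets by the same offset and leaves every per-sample error unchanged. A minimal pure-Python sketch with made-up numbers (not real WBM data):

```python
# Toy illustration: hull distance = formation energy minus a fixed
# per-composition hull energy, so per-sample errors are identical.
e_hull = [-0.2, 0.1, -0.5]  # hull energy per composition (constant offsets)
e_form_true = [0.0, 0.3, -0.4]
e_form_pred = [0.1, 0.2, -0.3]

def mae(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

hull_true = [ef - eh for ef, eh in zip(e_form_true, e_hull)]
hull_pred = [ef - eh for ef, eh in zip(e_form_pred, e_hull)]

# identical up to floating-point rounding
assert abs(mae(e_form_true, e_form_pred) - mae(hull_true, hull_pred)) < 1e-12
```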
If you still want to look at your model-predicted hull distances for other reasons, there are two options:

1. Use the existing code that converts `e_form` to predicted `hull_dist`. See
https://github.com/janosh/matbench-discovery/blob/3549db39f150f30824f9b8d67ec434e3df5cd275/matbench_discovery/preds.py#L162-L166

2. Convert `e_form_per_atom` back to `energy`:

```python
energy = e_form_per_atom * n_atoms + sum(composition[elem] * elem_ref_energies[elem] for elem in composition)
```

Then assign `energy` as the `uncorrected_energy` of the WBM `ComputedStructureEntry` and pass it to `ppd_mp` the same way you linked above:
https://github.com/janosh/matbench-discovery/blob/297251c0f24bc2f62e6fb5a6ab0a1883cff29e9e/data/wbm/fetch_process_wbm_dataset.py#L559
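As a concrete sketch of the formation energy → total energy inversion (pure Python, with a plain dict standing in for a pymatgen `Composition`; `elem_ref_energies` is assumed here to map element symbols to *per-atom* elemental reference energies, and the numeric values below are made up for illustration):

```python
def e_form_to_energy(e_form_per_atom, composition, elem_ref_energies):
    """Invert formation energy per atom back to a total (uncorrected) energy.

    composition: dict mapping element symbol -> number of atoms, e.g. {"Ac": 6, "U": 2}
    elem_ref_energies: assumed per-atom elemental reference energies (toy values here)
    """
    n_atoms = sum(composition.values())
    ref_energy = sum(amt * elem_ref_energies[el] for el, amt in composition.items())
    return e_form_per_atom * n_atoms + ref_energy

# toy per-atom reference energies (made up, not actual MP values)
refs = {"Ac": -4.1, "U": -11.3}
energy = e_form_to_energy(0.5, {"Ac": 6, "U": 2}, refs)
# 0.5 * 8 + 6 * (-4.1) + 2 * (-11.3) = -43.2
```

Note that each per-atom reference energy is multiplied by the number of atoms of that element in the composition.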
Let me know if things are still unclear.
Thanks, I understand the MAE could be used for tuning the models, but it seems like MAE doesn't measure performance correctly. The MAE of CGCNN is 0.14 and its $R^2$ is -0.61, yet its classification result is still good compared to CGCNN+P, which has a lower MAE. I am confused by this, so I would like to see the classification results by `e_above_hull`.
Nice, I see you've already adopted one of the primary insights from this work. Don't only look at regression metrics. 😄 Even though very prevalent in the literature, they are not the right performance indicators for stability prediction.
Like you said, the random structure perturbations in CGCNN+P improve regression performance but negatively affect classification accuracy. This was very surprising to me at first but the reason for this becomes clear from looking at the classification histograms.
CGCNN+P is more sharply peaked, meaning more of its predictions fall close to the decision boundary at 0 eV/atom above the hull. In this regime, even small errors are large enough to nudge a correct classification over that boundary and turn it into an incorrect one.
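To make the boundary effect concrete, here is a toy sketch (made-up numbers) of stability classification derived from a predicted hull distance by thresholding at 0 eV/atom. Two predictions with the same absolute error can land on opposite sides of the boundary:

```python
def classify_stable(e_above_hull, threshold=0.0):
    """Predict a structure stable if its hull distance is at or below the threshold."""
    return e_above_hull <= threshold

# true hull distance of a barely-unstable structure
e_true = 0.02
# two predictions, both off by 0.03 eV/atom
pred_same_side = 0.05    # still classified unstable -> correct
pred_across = -0.01      # crosses the boundary, classified stable -> incorrect

assert classify_stable(e_true) == classify_stable(pred_same_side)
assert classify_stable(e_true) != classify_stable(pred_across)
```

Identical regression error, opposite classification outcomes: this is why predictions peaked near the decision boundary hurt classification metrics even when MAE improves.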
And that's also a reason I asked the earlier question in https://github.com/CompRhys/aviary/issues/72#issuecomment-1555958141. I've tried to train the model with cross-entropy loss and do classification directly, and it turns out the performance is much worse than regression.
Out of curiosity, which model are you training?
Our transformer model. Its performance on Matbench is similar to Wrenformer, but much worse on WBM.
Interesting. What's the input to your model (if you want to share)?
I take the composition and space group as input. Though our model is large (it only requires around 4 GB of memory at a batch size of 128, which surprises me), it underperforms on some datasets like perovskites and jdft2d.
@janosh I tried to use the true formation energy, convert it back to energy, and get the true `e_above_hull`, but found the energies don't match. For example, for `wbm-1-1` the formula is "Ac6 U2" and the formation energy per atom is 0.544327. I then load the `mp_elemental_ref_entries` and get the reference energies for "Ac" (-16.48470003) and "U" (-22.58282002). However, $8 \times 0.544327 + (-16.48470003 + -22.58282002) = -34.71290405$, but the `uncorrected_energy` here should be -42.954. Did I make a mistake here or miss some of the steps?
Thanks for the previous reply, I am able to obtain the ground truth `e_above_hull`. But how can I get the predicted `e_above_hull` from formation energies? It is using the `ComputedStructureEntry` as input:
https://github.com/janosh/matbench-discovery/blob/297251c0f24bc2f62e6fb5a6ab0a1883cff29e9e/data/wbm/fetch_process_wbm_dataset.py#L559
Sorry for bothering you, I am developing a new model and trying to test it on different benchmarks. It would be great to know the performance with different hyperparameters before contributing to MBD.