hongshuh opened this issue 1 year ago
I believe @hrushikesh-s is currently working on submitting Wrenformer to Matbench. Maybe he can tell you more.
In case you haven't seen them, there are some preliminary results for various Wrenformer hyperparameter settings plotted in #44.
@hongshuh Also, what are you planning on using Wrenformer for? If discovery, these results might interest you: https://matbench-discovery.materialsproject.org/preprint#results.
Yeah, I am also following the discovery benchmark. It seems to handle the task as a regression problem by predicting the energy above hull, rather than treating it as a classification task of identifying whether a material is stable or not. I am a bit puzzled by this approach, since the aim seems to be the identification of stable materials, which would intuitively seem to be a classification task.
> It seems to handle the task as a regression problem by predicting the energy above hull
That's right.
> rather than treating it as a classification task of identifying whether a material is stable or not. I am a bit puzzled by this approach since the aim seems to be the identification of stable materials, which would intuitively seem to be a classification task.
I have some preliminary results which suggest doing direct classification does not improve over regression. But I think that's definitely something that could be investigated further. If you want to check how well a Wrenformer stability classifier performs compared to the Wrenformer regressor, that would be a very welcome contribution to MBD!
This section from Bartel et al. 2021 is also relevant here:
> As an additional demonstration, all representations (except Roost—see “Methods” for details) were also trained as classifiers (instead of regressors), tasked with predicting whether a given compound is stable (ΔHd ≤ 0) or unstable (ΔHd > 0). The accuracies, F1 scores, and false positive rates are tabulated in Supplementary Table 2 and found to be only slightly better (accuracies < 80%, F1 scores < 0.75, false positive rates > 0.15) than those obtained by training on ΔHf (Fig. 4) or ΔHd (Supplementary Fig. 4).
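For reference, scoring the existing regressor as a stability classifier only requires thresholding its predictions at the hull. Here's a minimal sketch (not MBD code; the file and column names are placeholder assumptions) that computes the same accuracy / F1 / false-positive-rate metrics Bartel et al. report:

```python
# Minimal sketch: turn regression predictions of the energy above hull into
# stable/unstable labels and score them as a classifier.
# "wrenformer_preds.csv" and the column names below are hypothetical.
import pandas as pd
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score

df = pd.read_csv("wrenformer_preds.csv")

# A compound is labeled stable if its energy above hull is <= 0 eV/atom.
y_true = df["e_above_hull_true"] <= 0
y_pred = df["e_above_hull_pred"] <= 0

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"accuracy = {accuracy_score(y_true, y_pred):.3f}")
print(f"F1       = {f1_score(y_true, y_pred):.3f}")
print(f"FPR      = {fp / (fp + tn):.3f}")  # false positive rate, as in Bartel et al.
```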
> I have some preliminary results which suggest doing direct classification does not improve over regression.
Thanks! Maybe the regression values provide more information to the model than just "stable" or "unstable" labels.
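To put that concretely: the classifier only ever sees a binary label, while the regressor sees the full distance from the hull. A rough sketch of the two training targets in generic PyTorch (not the actual Wrenformer/aviary code; the tensors below are made-up placeholders):

```python
import torch
import torch.nn.functional as F

# Hypothetical DFT targets: energy above the convex hull in eV/atom.
e_above_hull = torch.tensor([-0.05, 0.02, 0.30])

# Regression head: predicts the energy above hull directly.
reg_pred = torch.tensor([-0.01, 0.10, 0.25])
reg_loss = F.l1_loss(reg_pred, e_above_hull)

# Classification head: predicts a stability logit; the target collapses to a
# binary label (stable if e_above_hull <= 0), discarding the distance to the hull.
clf_logit = torch.tensor([1.2, -0.3, -2.0])
clf_target = (e_above_hull <= 0).float()
clf_loss = F.binary_cross_entropy_with_logits(clf_logit, clf_target)
```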
I saw in the commit history that you have run some experiments on the Matbench benchmark. It's a very good idea and model, but I may not have enough computational resources to run it myself. Could you let me know if you have the final results?