WardLT opened this issue 2 years ago (status: Open)
Once done, evaluate performance on the redox dataset (#2)
TL;DR: This issue is a low priority because the current network architecture works almost as well as my original.
It turns out the network from my original MPNN is only slightly better: I get an MAE of 0.133 V on the redoxmer dataset (#5), compared to 0.140 V using this repository's MPNN.
That architecture uses a `reduce_sum` to combine atomic fingerprints into a molecular fingerprint *before* passing it through the output MLP. The implementation in this repository applies the `reduce_sum` *afterwards*. Changing that does not seem to be needed in light of other architectural differences (e.g., this repo uses a "global" state, which my original MPNN does not).
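To make the distinction concrete, here is a minimal numpy sketch of the two readout orders. The fingerprints, weights, and the tiny `mlp` helper are all hypothetical stand-ins, not code from this repository; the point is only that summing before vs. after a nonlinear MLP gives different predictions.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w1, w2):
    """A tiny two-layer MLP with ReLU, applied along the last axis."""
    return np.maximum(x @ w1, 0.0) @ w2

# Hypothetical atomic fingerprints for a 5-atom molecule (16 features each)
atom_fp = rng.normal(size=(5, 16))
w1 = rng.normal(size=(16, 8))
w2 = rng.normal(size=(8, 1))

# Readout A (original MPNN): reduce_sum over atoms first, then the MLP
mol_fp = atom_fp.sum(axis=0)      # molecular fingerprint, shape (16,)
pred_a = mlp(mol_fp, w1, w2)      # one prediction for the whole graph

# Readout B (this repository): MLP on each atom, then reduce_sum
per_atom = mlp(atom_fp, w1, w2)   # per-atom contributions, shape (5, 1)
pred_b = per_atom.sum(axis=0)     # molecular property as a sum over atoms

print(pred_a, pred_b)
```

If the readout network were linear the two orders would be interchangeable; it is the nonlinearity that makes the placement of the `reduce_sum` an architectural choice.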
The current architecture is tuned for molecular energy by predicting the energy as a sum over all atoms. The models from our older GNN implementation that worked best for redox properties instead predict the redox potential from a whole-graph fingerprint.
See our paper on GNNs for solvation energy. The text around Figure 1 explains how moving the readout function can have a large effect.
Proposed fix: modify the `make_model` function to allow moving the location of the `reduce_sum` layer.
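One way the proposed option could look, sketched with numpy rather than the repo's actual `make_model` signature (the `atomwise` flag and weight shapes here are assumptions for illustration):

```python
import numpy as np

def make_readout(atomwise: bool):
    """Hypothetical factory: build a readout with a movable reduce_sum.

    atomwise=True  -> MLP per atom, then reduce_sum (suits extensive
                      properties like energy).
    atomwise=False -> reduce_sum into a whole-graph fingerprint, then
                      the MLP (the readout used for redox potential).
    """
    rng = np.random.default_rng(1)
    w1 = rng.normal(size=(16, 8))
    w2 = rng.normal(size=(8, 1))

    def model(atom_fp):
        if atomwise:
            out = np.maximum(atom_fp @ w1, 0.0) @ w2  # MLP on each atom
            return out.sum(axis=0)                    # reduce_sum last
        mol_fp = atom_fp.sum(axis=0)                  # reduce_sum first
        return np.maximum(mol_fp @ w1, 0.0) @ w2      # MLP on the graph
    return model
```

A single flag like this would let the same `make_model` serve both the energy-style and whole-graph readouts without duplicating the rest of the architecture.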