NSAPH-Projects / topological-equivariant-networks

E(n)-Equivariant Topological Neural Networks
MIT License

lift fully-present constraint #2

Closed ekarais closed 8 months ago

ekarais commented 8 months ago

The original implementation suffers from a limitation that I call the fully-present constraint. Suppose we want to lift our graphs to simplicial complexes of rank 2, i.e. the highest allowed rank of a simplex is 2 (a triangle). We specify this by setting --dim 2. The implementation then assumes that every graph in the dataset contains at least one rank-2 simplex (triangle). If this assumption does not hold, the script breaks at multiple points. This happens, for example, if we set the radius threshold for the Vietoris-Rips lift to something below 3: some graphs still have triangles while others do not. It can also happen with rings and functional groups, since not all molecules have them. The purpose of this PR is to lift the fully-present constraint.
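To make the failure mode concrete, here is a minimal, framework-free sketch of a Vietoris-Rips lift. It is not the repository's lifting code: the function name `rips_simplices`, the toy coordinates, and the convention that a simplex is included when all pairwise distances are at most `radius` are assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def rips_simplices(points, radius, max_dim):
    """Return {rank: list of vertex-index tuples} for a toy Vietoris-Rips complex.

    Assumption: a simplex is included when every pair of its vertices
    lies within `radius` of each other.
    """
    n = len(points)

    def close(i, j):
        return dist(points[i], points[j]) <= radius

    complex_by_rank = {}
    for rank in range(max_dim + 1):
        complex_by_rank[rank] = [
            simplex
            for simplex in combinations(range(n), rank + 1)
            if all(close(i, j) for i, j in combinations(simplex, 2))
        ]
    return complex_by_rank


# Hypothetical point clouds: three mutually close points vs. three points on a line.
tight = [(0.0, 0.0), (1.0, 0.0), (0.5, 0.8)]
chain = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]

tight_complex = rips_simplices(tight, radius=1.0, max_dim=2)
chain_complex = rips_simplices(chain, radius=1.0, max_dim=2)

print(len(tight_complex[2]))  # number of triangles in the first point cloud
print(len(chain_complex[2]))  # number of triangles in the second point cloud
```

With the same threshold, the first point cloud yields a triangle while the second yields none, so any downstream code that indexes into the rank-2 cells unconditionally would fail on the second graph.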

EMPSN predicts the graph-level target by first aggregating the hidden representations of all simplices of each rank into one representation per rank. If we set --dim 2, EMPSN sums the hidden representations of all rank-0 simplices, all rank-1 simplices, and all rank-2 simplices separately, arriving at 3 hidden representations. The question is: what should the aggregate hidden representation for rank 2 be if the graph does not have any rank-2 simplices? I gave it some thought and chose a feature vector of all zeros, but I would be happy to discuss alternatives.
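The zero-vector choice can be sketched as follows. This is a plain-Python illustration rather than the EMPSN code in this PR: the function `aggregate_per_rank`, its arguments, and the dictionary layout of the hidden states are hypothetical.

```python
from typing import Dict, List


def aggregate_per_rank(
    hidden: Dict[int, List[List[float]]], max_dim: int, num_hidden: int
) -> List[List[float]]:
    """Sum hidden vectors rank by rank.

    A rank with no simplices (missing key or empty list) contributes an
    all-zeros vector, so the output always has max_dim + 1 entries.
    """
    pooled = []
    for rank in range(max_dim + 1):
        total = [0.0] * num_hidden  # default for absent ranks
        for vec in hidden.get(rank, []):
            for i, value in enumerate(vec):
                total[i] += value
        pooled.append(total)
    return pooled


# Hypothetical graph with two nodes and one edge but no triangles.
hidden = {0: [[1.0, 2.0], [3.0, 4.0]], 1: [[0.5, 0.5]]}
pooled = aggregate_per_rank(hidden, max_dim=2, num_hidden=2)
print(pooled)
```

Because a sum over an empty set is zero, this fallback is consistent with sum aggregation; it would not carry over unchanged to a mean aggregation, where the empty case would need separate handling.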

With this fix, running the script with the offending arguments from the previous PR should work without issues:

python main_qm9.py --target_name alpha --dim 2 --dis 0.5 --epochs 5 --num_hidden 20 --num_layers 3