NSAPH-Projects / topological-equivariant-networks

E(n)-Equivariant Topological Neural Networks
MIT License

adapt and enhance geometric features #15

Closed ekarais closed 7 months ago

ekarais commented 7 months ago

Description

This pull request introduces several enhancements and fixes to the geometric feature computation in our codebase. The legacy logic hardcoded assumptions about cells of ranks 0, 1, and 2, as well as the neighborhood relationships between them. On the one hand, these assumptions allowed the authors to craft very specific features, such as the exhaustive list of angles between cells of certain ranks. On the other hand, they made it impossible to extend such features to new neighborhood relationships or more loosely defined ranks. This pull request makes two main changes:

  1. Refactoring the geometric feature computation logic.
  2. Introducing new, scalable geometric features that make minimal assumptions about the cells and the neighborhood relationships.

    New geometric features

    We remove all existing geometric features and replace them with five new ones (one centroid distance, two within-cell maximum distances, and two Hausdorff distances):

    Distance between centroids: Euclidean distance between the centroids of cell pairs. The centroids are computed only once per rank and memoized to improve efficiency. The distances serve as geometric invariants that characterize the spatial relationships between cells of different ranks.
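As a rough illustration of the memoized centroid computation, here is a minimal standard-library sketch. The names `POSITIONS`, `CELLS_BY_RANK`, `centroids_for_rank`, and `centroid_distance` are hypothetical and are not taken from this pull request; the actual implementation may be organized differently.

```python
import math
from functools import lru_cache

# Hypothetical toy data (not from the PR): node coordinates and cells per rank.
POSITIONS = {0: (0.0, 0.0), 1: (2.0, 0.0), 2: (2.0, 2.0), 3: (0.0, 2.0)}
CELLS_BY_RANK = {
    0: ((0,), (1,), (2,), (3,)),  # rank-0 cells: single nodes
    1: ((0, 1), (2, 3)),          # rank-1 cells: pairs of nodes
}


@lru_cache(maxsize=None)
def centroids_for_rank(rank):
    """Compute the centroid of every cell of `rank` once and cache the result."""
    centroids = []
    for cell in CELLS_BY_RANK[rank]:
        points = [POSITIONS[node] for node in cell]
        dim = len(points[0])
        centroids.append(
            tuple(sum(p[k] for p in points) / len(points) for k in range(dim))
        )
    return tuple(centroids)


def centroid_distance(sender_rank, sender_idx, receiver_rank, receiver_idx):
    """Euclidean distance between the centroids of a sender and a receiver cell."""
    a = centroids_for_rank(sender_rank)[sender_idx]
    b = centroids_for_rank(receiver_rank)[receiver_idx]
    return math.dist(a, b)
```

Because the cache is keyed by rank, repeated calls for cell pairs of the same ranks reuse the stored centroids instead of recomputing them.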

    Maximum pairwise distance within-cell: For both the sender and receiver cell, the pairwise distances between their nodes are computed and the maximum distance is stored. This feature is meant to very loosely approximate the size of each cell.
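The within-cell maximum can be sketched in a few lines. The function name `max_pairwise_distance` and the choice to return 0.0 for a single-node cell are assumptions made for this illustration, not details confirmed by the pull request.

```python
import math
from itertools import combinations


def max_pairwise_distance(points):
    """Largest Euclidean distance between any two nodes of a cell.

    Assumption for this sketch: a cell with a single node has size 0.0.
    """
    return max(
        (math.dist(p, q) for p, q in combinations(points, 2)),
        default=0.0,
    )
```

For example, `max_pairwise_distance([(0.0, 0.0), (3.0, 4.0)])` evaluates to `5.0`. The feature would be computed once for the sender cell and once for the receiver cell.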

    Two Hausdorff distances: We compute two Hausdorff distances, one from the sender's point of view and one from the receiver's. The Hausdorff distance between two sets of points is typically defined as

        H(A, B) = max{sup_{a in A} inf_{b in B} d(a, b), sup_{b in B} inf_{a in A} d(b, a)}
    
    where A and B are two sets of points, d(a, b) is the Euclidean distance between points a and
    b, sup denotes the supremum (least upper bound) of a set, and inf denotes the infimum
    (greatest lower bound) of a set. Rather than taking the maximum, we return both terms
    separately. This choice implicitly encodes the subset relationship in these features: for
    the finite node sets of cells, the first term is 0 iff A is a subset of B, and the second
    term is 0 iff B is a subset of A.
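For finite point sets, sup and inf reduce to max and min, so the two terms can be sketched as follows. The function names `directed_hausdorff` and `hausdorff_features` are hypothetical, and the sketch assumes both cells contain at least one node.

```python
import math


def directed_hausdorff(A, B):
    """max over a in A of (min over b in B of d(a, b)) for non-empty finite sets."""
    return max(min(math.dist(a, b) for b in B) for a in A)


def hausdorff_features(A, B):
    """Return both directed terms instead of only their maximum."""
    return directed_hausdorff(A, B), directed_hausdorff(B, A)
```

With `A = [(0.0, 0.0)]` and `B = [(0.0, 0.0), (3.0, 4.0)]`, the pair is `(0.0, 5.0)`: the first term vanishes because A is a subset of B, while the second does not. Taking the maximum of the pair recovers the usual symmetric Hausdorff distance H(A, B).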

We believe this set of features is a reasonable starting point. If the need arises, we can add more features in the future.