rmrresearch / GhostFragment

Library focused on fragment-based methods and basis-set superposition corrections
https://rmrresearch.github.io/GhostFragment/
Apache License 2.0

Intersection Finder #48

Open ryanmrichard opened 1 year ago

ryanmrichard commented 1 year ago

Given a FragmentedMolecule object, we need to find the set of intersections (also called overlaps). Two fragments in a FragmentedMolecule object are said to intersect if at least one nucleus appears in both fragments. The intersection of two fragments is the set of all nuclei appearing in both. Intersections are not limited to two fragments; e.g., three fragments can share a set of common nuclei. In general, for n fragments we need to consider overlaps arising from two fragments at a time, three at a time, four at a time, and so on up to all n fragments at once.
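To make the definition concrete, here is a minimal sketch of a two-fragment intersection. The FragmentedMolecule API is not shown in this issue, so `fragment_type` (a sorted list of nucleus indices) and `intersect` are hypothetical stand-ins, not GhostFragment names.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical stand-in for a fragment: nucleus indices in ascending order.
using fragment_type = std::vector<std::size_t>;

// Returns the nuclei that appear in both fragments. The result is empty
// when the fragments do not intersect. Both inputs must be sorted.
fragment_type intersect(const fragment_type& a, const fragment_type& b) {
    fragment_type common;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(common));
    return common;
}
```

An intersection of three or more fragments can be built by feeding the result back in as one of the arguments, since the output is itself sorted.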

In addition to finding the intersections, we need to weight them. For this we use the inclusion-exclusion principle (IEP): an intersection arising from an even number of fragments gets a weight of -1, and one arising from an odd number of fragments gets a weight of 1.
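The enumeration and weighting described above could be sketched roughly as follows. This is a brute-force illustration, not a proposed implementation. The names `iep_weight`, `find_intersections`, and `weight_map` are hypothetical, and fragments are assumed to be plain sorted index lists rather than a FragmentedMolecule. The sketch also assumes that when the same set of nuclei arises from several different combinations of fragments, the weights are summed.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <map>
#include <vector>

using overlap_type = std::vector<std::size_t>;
using weight_map   = std::map<overlap_type, float>;

// IEP sign: +1 for an odd number of fragments, -1 for an even number.
float iep_weight(std::size_t n_fragments) {
    return (n_fragments % 2 == 1) ? 1.0f : -1.0f;
}

// Hypothetical helper: visits every combination of two or more fragments,
// intersects them, and accumulates the IEP weight of each non-empty result.
// Each fragment must be sorted; assumes fewer fragments than bits in size_t.
weight_map find_intersections(const std::vector<overlap_type>& frags) {
    weight_map overlaps;
    const std::size_t n = frags.size();
    // Bit i of mask set means fragment i takes part in this combination.
    for (std::size_t mask = 1; mask < (std::size_t{1} << n); ++mask) {
        overlap_type common;
        std::size_t count = 0;
        for (std::size_t i = 0; i < n; ++i) {
            if (!(mask & (std::size_t{1} << i))) continue;
            if (count == 0) {
                common = frags[i];  // first fragment seeds the intersection
            } else {
                overlap_type tmp;
                std::set_intersection(common.begin(), common.end(),
                                      frags[i].begin(), frags[i].end(),
                                      std::back_inserter(tmp));
                common.swap(tmp);
            }
            ++count;
        }
        // Single fragments and empty intersections are not overlaps.
        if (count < 2 || common.empty()) continue;
        overlaps[common] += iep_weight(count);  // sum weights of repeated sets
    }
    return overlaps;
}
```

The loop visits all 2^n combinations, so it is only practical for a small number of fragments. A real implementation would likely prune combinations whose partial intersection is already empty. Entries whose weights cancel to zero are kept in the map here; whether to drop them is a design decision left open.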

As an example, consider the following fragments of a five-nucleus system (numbers indicate nucleus indices and are 1-based):

There are five non-empty pairwise intersections:

and one non-empty three-way intersection:

The final set of intersections is:

At the moment I'm not sure what the interface should look like, so for now we can write a function which takes a FragmentedMolecule object and returns a std::map<std::vector<std::size_t>, float> object. The return value is a key/value structure where each key is the set of nuclei in an overlap and each value is that overlap's weight. For the above example the final answer would be (note that indices are now 0-based to conform to C++ conventions):

using overlap_type = std::vector<std::size_t>;
std::map<overlap_type, float> overlaps;
overlaps[overlap_type{0}] = -1.0;
overlaps[overlap_type{2}] = -1.0;
overlaps[overlap_type{3}] = -1.0;