Open scal444 opened 2 years ago
I thought there was already self-RDF but apparently not. I think this is probably a good idea, and making it generalisable to allow any width of diagonal band to be excluded sounds useful for e.g. polymers where you don't want to exclude the same chain, but some sort of length.
One sticky point in implementing this is currently numpy doesn't really offer a nice way of implementing a set of pairs of ints (i.e. std::map<std::pair<int, int>>
). This would also be useful in our bond algorithms, so having a nice implementation in lib._cutil
would be super handy.
Could one use a similar approach as used for exclusions to get single counts?
Is your feature request related to a problem?
For InterRDF, when sel1 and sel2 contain overlapping atoms, the resulting counts (if within the distance cutoff) give hits for all combinations of particles. For example, with 4 particles in each selection, there are 16 hits (4x4). This includes 4 zero distance hits for (0,0), (1,1), and so on. It also doubles the same distance count for (1,2), (2,1).
Describe the solution you'd like
I'm not sure if there's one correct way to address this, but I think it makes more sense not to count self 0 distance hits. I'm less certain on the (2,1), (1,2) double-counting problem. For one data point, Gromacs will only count one hit, so for the example above there would be 6 total entries in counts - (0,1), (0,2), (0,3), (1,2), (1,3), and (2, 3).
If that approach were to be adapted, we could also use the self-search optimization in NSGrid for the common case of the two input selections being identical (in my research, that was pretty much exclusively what I did), which cuts the runtime of the analysis about in half.
Describe alternatives you've considered
There could instead be a self-rdf class if we wanted both behaviors to exist, or a flag to switch between them in InterRDF.
Additional context