Closed nawtrey closed 2 weeks ago
Here are some links where people have discussed algorithms for collecting all spanning trees:
With https://github.com/Becksteinlab/kda/commit/61a1fcbd0aab22f0075a1438e0977af32ed1918f merged I've checked off the directional diagram task.
Upon further consideration I don't know if generate_partial_diagrams
is really a rate-limiting step for KDA right now.
For example, using a maximally-connected 7-state diagram, the current algorithm generated the 16807 partial diagrams in roughly 1 second:
[70.83%] ··· ...alDiagrams.peakmem_generate_partial_diagrams ok
[70.83%] ··· ============== ======= =======
-- return_edges
-------------- ---------------
graph True False
============== ======= =======
3-state 80.3M 80.3M
Hill-5-state 80.4M 80.4M
Hill-8-state 80.6M 80.5M
EmrE-8-state 80.6M 82.1M
Max-7-state 81.3M 138M
============== ======= =======
[75.00%] ··· ...rtialDiagrams.time_generate_partial_diagrams ok
[75.00%] ··· ============== ============= ============
-- return_edges
-------------- --------------------------
graph True False
============== ============= ============
3-state 225±9μs 215±3μs
Hill-5-state 459±10μs 435±20μs
Hill-8-state 2.69±0.02ms 2.94±0.2ms
EmrE-8-state 16.7±1ms 15.3±0.1ms
Max-7-state 1.07±0.05s 960±10ms
============== ============= ============
Comparatively, the new directional diagrams algo requires ~13 seconds to generate both the partial diagrams and directional diagrams:
[54.17%] ··· ...agrams.peakmem_generate_directional_diagrams ok
[54.17%] ··· ============== ======= =======
-- return_edges
-------------- ---------------
graph True False
============== ======= =======
3-state 80.4M 80.5M
Hill-5-state 80.7M 80.9M
Hill-8-state 80.9M 83.5M
EmrE-8-state 82.5M 102M
Max-7-state 152M 818M
============== ======= =======
[58.33%] ··· ...lDiagrams.time_generate_directional_diagrams ok
[58.33%] ··· ============== ============ ============
-- return_edges
-------------- -------------------------
graph True False
============== ============ ============
3-state 623±100μs 791±100μs
Hill-5-state 2.82±0.2ms 4.14±0.4ms
Hill-8-state 23.0±0.4ms 46.7±2ms
EmrE-8-state 179±8ms 328±20ms
Max-7-state 8.28±0.04s 12.7±0.07s
============== ============ ============
So while it may be good to improve the partial diagram algorithm for other reasons, performance is probably not a main concern. If I can find any way to improve partial diagram algorithm performance I will probably close this issue.
Problem
The primary limitation for graph complexity is the space complexity of functions like
kda.diagrams.generate_partial_diagrams()
andkda.diagrams.generate_directional_partial_diagrams()
. This is inherently related to issue #2, where this issue is focused on one part of that issue.This problem is 2 fold:
While 2. does contribute to the issue, the main issue is 1., especially for simpler input graphs. At the moment, the partial diagrams are generated by finding every possible combination of N-1 edges then discarding subgraphs that don't fit the definition of a partial diagram. Complete input graphs contain N(N-1)/2 unique edges, and thus have N^(N-2) partial diagrams (after doing the appropriate combinatorics). These yield approximately N! possibilities (using Stirling's approximation), so generating all possibilities is probably appropriate. But for simpler input graphs the number of partial diagrams is much less numerous so generating every possibility is likely not necessary.
Solution(s)
Before I list the possible solutions, I have to mention that in graph theory a "partial diagram" is known as a "spanning tree", so the solutions below will be using the graph theory language.
There are many solutions to this problem, many of which have formulas for calculating the space complexity of the algorithm. At the moment it isn't obvious which method is the best method, so I'm just going to dump links below.
I was going to list more, but the other articles/papers I have are all references in the above paper so for now I will leave it here.
Tasks
generate_partial_diagrams
generate_directional_diagrams