Open shivensinha4 opened 2 years ago
If it sounds good, I'd love to work on implementing these ideas in a separate function.
Fill free to give it a try.
I don't know if better algorithms have been proposed for undirected graphs. I have chosen to implement the Roskind-Tarjan
algorithm without extra optimization to get something easy to maintain. Now that we have it, we can have other implementations and we are able to compare results and running times.
Description changed:
---
+++
@@ -1,3 +1,5 @@
+Concern method `sage.graphs.spanning_tree.edge_disjoint_spaaning_trees` for undirected graphs introduced in #32911 as part of #32169.
+
- `edge_label` is currently a stored as a dictionary with frozensets of edge endpoints as keys. Since our structure is a collection of forests, we can utilise the fact that each edge can be uniquely represented as the parent edge of some node. We can attempt to replace this dictionary with an integer array (even a C array for better performance, we don’t even need resizes or appends). Then, we can structure the code to perform lookups using node indices and trace back the augmenting paths using the same idea. The containers for storing parents (`p`), which is currently a list of dicts, can be similarly restructured.
- We can also speed up the initialisation of the labels before each labelling step. Instead of initialising a new label container on each edge-independence check iteration, we can keep track of the last iteration that changed the label. When we need to access the labels anytime during our processes, we first check if the last *edit time* for this label was during this iteration, and lazily perform updates. The same array is reused throughout all edge-independence checks instead of new dictionaries, so this will also make it more memory efficient. Even using another array to store the *edit time* would be a nice idea, but we can go further and just hash the pairs of `(time, label)` into a single integer of `(time * maximum_label_bound) + label`
This seems like something that could be ripe for a medium-size GSoC idea.
Concern method
sage.graphs.spanning_tree.edge_disjoint_spaaning_trees
for undirected graphs introduced in #32911 as part of #32169.edge_label
is currently a stored as a dictionary with frozensets of edge endpoints as keys. Since our structure is a collection of forests, we can utilise the fact that each edge can be uniquely represented as the parent edge of some node. We can attempt to replace this dictionary with an integer array (even a C array for better performance, we don’t even need resizes or appends). Then, we can structure the code to perform lookups using node indices and trace back the augmenting paths using the same idea. The containers for storing parents (p
), which is currently a list of dicts, can be similarly restructured.We can also speed up the initialisation of the labels before each labelling step. Instead of initialising a new label container on each edge-independence check iteration, we can keep track of the last iteration that changed the label. When we need to access the labels anytime during our processes, we first check if the last edit time for this label was during this iteration, and lazily perform updates. The same array is reused throughout all edge-independence checks instead of new dictionaries, so this will also make it more memory efficient. Even using another array to store the edit time would be a nice idea, but we can go further and just hash the pairs of
(time, label)
into a single integer of(time * maximum_label_bound) + label
As @dcoudert pointed out, the cost of this approach is that the vertex labels can’t internally be of any hashable type - only integers. We’ll first need to relabel the vertices if they are of other types. We can then also go ahead and use a specific data structure to speed up iterations (e.g. static sparse graph).
CC: @dcoudert
Component: graph theory
Issue created by migration from https://trac.sagemath.org/ticket/33634