igraph / igraph

Library for the analysis of networks
https://igraph.org
GNU General Public License v2.0
1.75k stars 405 forks source link

Steiner Tree implementation #1803

Open Nachiket18 opened 3 years ago

Nachiket18 commented 3 years ago

What is the feature or improvement you would like to see? Steiner Tree problem is one of the 21 NP-Complete problem suggested by Richard Karp. For a given graph G(V,E), Steiner tree is a tree which has minimum length and connects subset of vertices in the graph. If the subset of vertices is all vertices in the graph then problem is reduced to minimum spanning tree that can be solved in polynomial time using various algorithms. In this implementation we need to implement FPT (Fixed Parameter Tractable) algorithm as it's NP problem. Further implementations can have approximate algorithms.

Use cases for the feature A simple use case can be a road graph where we have to find our minimum length route between multiple cities.

References

  1. https://en.wikipedia.org/wiki/Steiner_tree_problem
  2. Dreyfus-Wagner algorithm (https://onlinelibrary.wiley.com/doi/abs/10.1002/net.3230010302)
  3. Technical blog - https://medium.com/dataengineering-and-algorithms/steiner-problem-in-graphs-9d63ab6214de
Nachiket18 commented 3 years ago

The algorithm implementation needs use of data structure for mapping subsets to a number. I noticed that igraph has hashtable designed for string, can similar data structure be created for 'int' or there's any known way to implement hashtable for 'int' data type.

PS. Implementation can be done using 'map' as well.

ntamas commented 3 years ago

Well, the "hashtable" is not really a hash table anyway -- I guess it is implemented with a trie and a few vectors, and there's no int variant for it. Can you please elaborate a bit on why you need a hashtable for ints, i.e. what sort of subsets you would like to map to numbers? Maybe there's a way around that with some alternative data structure that igraph already provides. If there isn't, the easiest is probably to allow C++ for this part of the library and to use an std::map<>

Nachiket18 commented 3 years ago

Thanks for coming back on this. Yes, I went through hashtable.c code and it's implemented using tries and vectors. So, what our algorithm is trying to do is , we take terminals (known as steiner terminals that are subset of vertices in graph G) from user and we want the minimum path tree between these vertices. In order to do that algorithm creates subsets of the steiner terminals of length 2,3, and so on. We then map these subsets to integers. During further process when we find distances from these subsets of terminal vertices to other vertices. We then store them into dynamic programming table where the mapping for subset to 'int' which was previously done is used.

If there's a way to do this using igraph data structures, that would be useful. I would think in that direction. If not then, I guess I can switch to std:map<>.

vtraag commented 3 years ago

Are the subsets non-overlapping? F that is the case, you could map each node of a subset to an integer, indicating which subset it is part of. You could then also track for each integer, all nodes that belong to that subset in a simple vector of vectors. Not sure if this would work on this particular case, just a thought...

ntamas commented 3 years ago

@Nachiket18 I think the easiest is just to go ahead with an std::map<> for the time being and get to a working implementation. Once we have that (and we have test cases for that), we can start thinking about whether it's worth providing a replacement for the map in pure C or not. igraph already has bits and pieces of C++ code so I don't think that having yet another part that uses C++ is a big deal as long as the public API is pure C.

Nachiket18 commented 3 years ago

@ntamas - Thanks. We will start with std::map<>

szhorvat commented 3 years ago

This is a reminder that it's best to do the development in the open, and show us what you've done so far even before it's complete. Feel free to open a pull request right now, and mark it as "draft". You do not need to have a clean set of commits, as we will squash them anyway before merging the PR---so don't worry about that. Also, feel free to keep asking even seemingly trivial questions as you go. It's rare that people's questions are "too simple", it's usually the opposite: it would have been better to ask early than go down the wrong path and waste work ...

We want to avoid the situation when people submit a large but flawed PR which is difficult to review, and needs many modifications before it can be merged. In my experience, giving frequent feedback in small chunks works out better than dealing with a large PR in one go.

Nachiket18 commented 3 years ago

Thanks @szhorvat . We will create a pull request soon as we have done some work with respect to coding. Will do a internal review and basic testing and create a pull request.

szhorvat commented 3 years ago

Will do a internal review and basic testing

Don't worry if it might not yet work correctly. Feel free to submit the PR and keep working on it / fixing it well before it's ready for merging. If you can show us the code, it also makes it easier for you to ask questions about various aspects of it.

szhorvat commented 3 years ago

Not sure if I mentioned it before, you might find the wiki helpful: https://github.com/igraph/igraph/wiki

Nachiket18 commented 3 years ago

Thanks @szhorvat . We have created a draft pull request Steiner Tree calculation #1809 . I have a doubt about some issues with implementation.

What I am trying to do -

I want to pass all the edges of a particular vertex to a function. This function is named steiner ratio which as the name suggests calculates a ratio of sum of edges passed divided by number of terminals.

Now, while designing I found about edge selectors and iterators but when I saw implementation of the djiktra algorithm, I see that weights are passed separately as a vector instead of part of graph object. Should the same approach be followed here? If that's case I would still have to write a code someplace else which would extract all the edge weights for a vertex. I'm little confused what is the function that extracts the weight of edge based on edge_id fetched while iterating the edges.

szhorvat commented 3 years ago

At the moment I am super busy, and can't look into the algorithm, so my answer may not be useful (perhaps someone else will chime in?) But here are some thoughts, based solely on your message here, without studying the algorithm.

I want to pass all the edges of a particular vertex to a function.

The "incidence list" in igraph contains exactly this information. https://igraph.org/c/doc/igraph-Data-structures.html#incident-edges There is also igraph_incident which computes it for one vertex.

Now, while designing I found about edge selectors and iterators

There is igraph_es_incident as well. You can use that for incident edges, in a similar way to igraph_incident.

However, if you need to retrieve incident edges for the same vertex multiple times, it is better to create aninclist first and use that.

what is the function that extracts the weight of edge based on edge_id

Normally, in igraph, weights are stores as a vector w where the edge with index i has weight w[i]. Just use the edge ID to index the weight vector. I hope this answers your question.