alex-87 / HyperGraphLib

C++ Hypergraph modelling Library using Boost and OpenMP with some algorithms, including isomorphism using Gecode.
https://alex-87.github.io/HyperGraphLib
MIT License
21 stars 3 forks source link

Documentation for isomorphism algorithm #13

Closed lady-bluecopper closed 3 years ago

lady-bluecopper commented 3 years ago

Hello, could you please share some information about the implementation of the hypergraph isomorphism algorithm? Did you implement some procedure taken from a published paper, or is it your own implementation? How does it work? Thanks!

alex-87 commented 3 years ago

Hello,

Sure ! Here is the explanation on how does the lib proceeds. For information, I didn't took this algorithm for any published paper (I don't know if any paper mentioning a similar algorithm exists). I decided to tackle this computational problem by passion for Hypergraphs and CSP (Constraint Solving Problem). This library was a school project. However, I've took information on hypergraphs from here :

Modelling the CSP

The goal is to decide whether a mapping exists between vertex / edge adjacency of two hypergraphs. The modelling algorithm is in three parts :

The algorithm

The algorithm is simple :

Solve it with Gecode

In this section, an overview of the Gecode modelling. The complete source code of this section is available here

Modelling the adjacencies

Gecode offers a good way to model and solve the CSP.

The first part (To model the adjacency between edges and vertex of the given hypergraph A) :

  for(boost::shared_ptr<HyperEdge>& e : edgeA ) {
    for(boost::shared_ptr<HyperVertex>& v : vertexA ) {

      // If the vertex is contained in the current edge
      if( e->containVertex(v) ) {
        Gecode::rel(*this, _bVarEdgeA[ j ], Gecode::IRT_EQ, j);
      } else {
        Gecode::rel(*this, _bVarEdgeA[ j ], Gecode::IRT_EQ, 0);
      }

      // Increment the relation index
      j++;
    }
  }

The second part (The same as the first part, for the second hypergraph)

  for(boost::shared_ptr<HyperEdge>& e : edgeB ) {
    for(boost::shared_ptr<HyperVertex>& v : vertexB ) {

      // If the vertex is contained in the current edge
      if( e->containVertex(v) ) {
        Gecode::rel(*this, _bVarEdgeB[ j ], Gecode::IRT_EQ, j);
      } else {
        Gecode::rel(*this, _bVarEdgeB[ j ], Gecode::IRT_EQ, 0);
      }

      // Increment the relation index
      j++;
    }
  }

The thrid part (To indicate a relation between the adjacencies models)

  int u( 0 );

  for(int g=0; g < edgeA.size(); g++) {
    for(int h=0; h < vertexA.size(); h++) {

      // Linking the hypergraph adjency structure
      Gecode::element(*this, _bVarEdgeA, _varEdge[u], _bVarEdgeB[u]);
      u++;
    }
  }

The fourth part (Distinct in values)

  // To make index identifier unique for the mapping array
  Gecode::distinct(*this, _varEdge );

Searching a solution

Once the model is set, the following code initializes and proceeds to the research of a solution :

  // Initializing the space search
  IsomorphSpace * is = new IsomorphSpace(_ptrHypergrapheAbstraitA, _ptrHypergrapheAbstraitB);

  // Posting the model contraints
  is->postConstraints();

  // Initializing the searcher (Deep-First Search)
  Gecode::DFS<IsomorphSpace> ensembleSolution(is);

  // If the system has a solution, then the hypergraphs are isomorph
  if( ensembleSolution.next() ) ret = true;
  else ret = false;

  // Cleaning
  delete is;

In accordance with the find of a mapping function, if there is a solution, the two hypergraphs are isomorph.

I hope I've answered to your questions, Best regards.

lady-bluecopper commented 3 years ago

Thanks for the reply! If I understood it correctly, you are basically trying to map the vertices of the first hypergraph to the vertices of the second one, under the constraints given by the two adjacency matrices A and B. I guess Gecode provides an exact solution to this matching problem. Do you happen to know its complexity? Do you think it could be possible to use your code to obtain canonical forms for hypergraphs? Since I need to compare a lot of hypergraphs, a canonical form could allow me to perform linear-time comparisons. Thanks again!

alex-87 commented 3 years ago

Concerning the complexity, to schematize the algorithm, given the two following hypergraphs:

Hypergraphs

The first computation creates the following table for the first hypergraph U:

Edge Vertice Rel. Id
1 A 1
1 B 2
1 C 0
2 A 0
2 B 5
2 C 6

Rel. Id is an unique number that identifies the link between an edge and a vertice. When an edge don't have a link to a vertice, Rel. Id is set to 0. The identifier increments at each link or non-link

For the second hypergraph V :

Edge Vertice Rel. Id
3 D 1
3 E 2
3 F 0
4 D 0
4 E 5
4 F 6

The third computation, linking relation structures :

i U[i] M[i] V[M[i]]
0 1 x_0 1
1 2 x_1 2
2 0 x_2 0
3 0 x_3 0
4 5 x_4 5
5 6 x_5 6

As Gecode's role is to find x_i of M = {x_0, ..., x_5}, I would say that with S = [Number of Edges] x [Number of Vertices], and considering the fact that for all i and j where i != j the distinct constraint set x_i != x_j, the complexity would be O(S!).

The searching process done by Gecode is the Depth-First search algorithm, detailed here https://www.gecode.org/doc-latest/MPG.pdf#section.41.3, and bases the node choices on the branching methodology, detailed here https://www.gecode.org/doc-latest/MPG.pdf#section.8.2. Gecode performs some optimizations that could reduce processing time.

In this library, the branching is Gecode::branch(*this, _varEdge, Gecode::INT_VAR_SIZE_MIN(), Gecode::INT_VAL_SPLIT_MIN());

Concerning the use of the code to obtain canonical forms for hypergraphs, yes, I've just added the MIT License to allow the use of the library code for any purpose.

lady-bluecopper commented 3 years ago

Thanks for all the details!