GT-TDAlab / dagP

Multilevel Directed Acyclic Graph Partitioner
GNU Lesser General Public License v3.0
28 stars 5 forks source link

API Issue #1

Closed cheshmi closed 3 years ago

cheshmi commented 3 years ago

I am trying to use DAGp using its API, i.e., useapi.cpp like below: ./api_test https://sparse.tamu.edu/Um/2cubes_sphere 12 it gives me seg fault.

However, when I use the same matrix using the dagP CLI, it prints a reasonable output: ./exe/rMLGP https://sparse.tamu.edu/Um/2cubes_sphere 12 Output: "Graph Information: Nb node: 101492 Nb Edges: 772886 Max in-degree: 23 Max out-degree: 28 Av. in-degree: 7.62 Av. out-degree: 7.62 Problem Information: Nb part: 12 Lower Bound[0]: 1.0 Upper Bound[0]: 8711.4

########################## RUN 0 (seed=1614136284) ######################## Partition: Edgecut: 98685 Balance: 1.029953 VCycle depth: 8 Vertex Contraction: 0.004 Edge Contraction: 0.006 Edge Weight Contraction: 0.448 Times in seconds: Coarsening: 0.103 Initial Partition: 0.001 Uncoarsening: 0.024 Total: 1.849 ######################## END RUN 0 ########################

Average Edgecut:98685 Standard Deviation: 0"

Any thought what is wrong? Thanks

myozka commented 3 years ago

Hello,

Thank you for pointing this issue out! Most likely cause is the line below: https://github.com/GT-TDAlab/dagP/blob/77881ae66f2426fcfa98d596e37fe19ee0abbcef/src/useapi.cpp#L21 https://github.com/GT-TDAlab/dagP/blob/77881ae66f2426fcfa98d596e37fe19ee0abbcef/src/useapi.cpp#L22

Fixed in commit f836c1b3fd0deae738f5ab757544e2c73f90a806

Please pull the latest version and try again.

Please reach out if you have any problems!

myozka commented 3 years ago

We are planning a version update soon, let me know if there is anything you would like us to consider.

cheshmi commented 3 years ago

Thanks Yusuf. I noticed that different runs of the code for the same matrix give different number of edge-cuts. Is this expected? image

myozka commented 3 years ago

Yes, this is a heuristic based algorithm. It may return varying edge cuts and different part assignments. We recommend returning "best of n" runs if you can afford it (as in, running it with "-n 5" parameter to report best of 5 runs).

myozka commented 3 years ago

If you would like, you could set the random seed and work with the same output across different runs as well. If you need any clarification on any command line arguments, let me know. I'd be happy to clarify.

cheshmi commented 3 years ago

I appreciate your help Yusuf. The seed part is clear. I have another question. I am trying to partition a DAG, i.e., a lower triangular part of a symmetric matrix. Here is the code I have:

 void dagp_partition_from_file(char *const *argv, int num_parts, int *&parts) {
  dgraph G;
  MLGP_option opt;
  idxType nbParts = num_parts;
  dagP_init_parameters(&opt, 2); // initializes default parameters
  dagP_init_filename(&opt, argv[1]); // initialize the input file name and then the  output file name
  dagP_opt_reallocUBLB(&opt, nbParts);
  dagP_read_graph (argv[1], &G, &opt);
  parts = (idxType *) calloc((G.nVrtx + 1), sizeof(idxType));
  if (parts == NULL)
   printf("Could not allocate `parts` array.\n");

  ecType x = dagP_partition_from_dgraph(&G, &opt, parts);

  printf("edge cut: %d\n", (int) x);
  dagP_free_option(&opt);
  dagP_free_graph(&G);
 }

/// This function converts the input DAG to the partitioned DAG based on the computed partitioning in parts:
 int convert_partition_to_p_graph(CSC *G, int n_parts, int *parts){
  //print_csc(1, "G:\n", G);
  std::vector<std::set<int>> partitioned_dag;
  partitioned_dag. resize(n_parts);
  for (int i = 0; i < G->n; ++i) {
   int src_p = parts[i];
   assert(src_p < n_parts);
   for (int j = G->p[i]; j < G->p[i + 1]-1; ++j) {
    int cur_v = G->i[j];
    int cur_p = parts[cur_v];
    assert(cur_p < n_parts);
    if(cur_p != src_p)
     partitioned_dag[src_p].insert(cur_p);
   }
  }
  std::cout<<"\n";
  for (auto k : partitioned_dag) {
   for (auto l : k) {
    std::cout<<l<<", ";
   }
   std::cout<<"\n";
  }
 }

And here is my main:

int main(int argc, char *argv[]){
/// ... read CSC
 int *parts; int num_parts=12;
 dagp_partition_from_file(argv, num_parts, parts);
 convert_partition_to_p_graph(L1_csc, num_parts, parts );
 free(parts);
}

And here is the output for https://sparse.tamu.edu/Um/2cubes_sphere :

edge cut: 98282

1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 
0, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 
0, 1, 3, 4, 5, 6, 7, 8, 9, 10, 11, 
0, 1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 
0, 1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 
0, 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 
0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 
0, 1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 
0, 1, 2, 3, 4, 5, 6, 7, 9, 10, 11, 
0, 1, 2, 4, 5, 6, 7, 8, 10, 11, 
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 11, 
0, 1, 2, 3, 4, 5, 6, 8, 9, 10, 

I am not sure what setting is wrong in my end that makes the partitioned graph cyclic. If you could help me with this, I'd really appreciate it.

myozka commented 3 years ago

It might be something simple. The graphs are stored 1-based in dagP, same as matrix market format. (first node is node 1, and last is node G->nVrtx, as shown in useapi. parts are assigned zero-based). So, when accessing parts array, you should start from parts[1] and go to parts[G->nVtrx] Can you fix that and check again?

Also, you should be able to call dagP_init_parameters(&opt, num_parts); and not have to call dagP_opt_reallocUBLB(&opt, nbParts); afterwards.

Should effectively do the same thing.

myozka commented 3 years ago

My code for reporting part graph:

    int maxParts= -1, i = 1, j=0;
    for (i=1; i<=G->nVrtx; ++i) {
        if (maxParts< part[i])
            maxParts = part[i];
    }
    int** adj = (int**) calloc(maxParts+1, sizeof(int*));
    for (i=0; i<=maxParts; ++i)
        adj[i] = (int*) calloc(maxParts+1,sizeof(int));
    for (i=1; i<=G->nVrtx; ++i)
        for(j=G->outStart[i]; j<=G->outEnd[i]; ++j) {
            if (part[i] != part[G->out[j]])
                adj[part[i]][part[G->out[j]]] += G->ecOut[j];
        }
    for (i=0; i<=maxParts; i++) {
        for (j=0; j<=maxParts; j++) {
            if(adj[i][j] > 0)
                fprintf(file, "%d->%d [weight=%d, label=%d];\n", i, j, adj[i][j], adj[i][j]);
        }
    }

and a sample output I get:

0->1 [weight=4937, label=4937];
0->2 [weight=9054, label=9054];
0->3 [weight=982, label=982];
0->4 [weight=3, label=3];
0->5 [weight=699, label=699];
0->6 [weight=43, label=43];
0->7 [weight=262, label=262];
0->8 [weight=8, label=8];
0->9 [weight=1378, label=1378];
0->10 [weight=2521, label=2521];
0->11 [weight=1146, label=1146];
1->2 [weight=2379, label=2379];
1->3 [weight=953, label=953];
1->5 [weight=4635, label=4635];
1->6 [weight=82, label=82];
1->7 [weight=1477, label=1477];
1->9 [weight=35, label=35];
1->10 [weight=227, label=227];
2->3 [weight=3627, label=3627];
2->4 [weight=2177, label=2177];
2->5 [weight=647, label=647];
2->6 [weight=16, label=16];
2->7 [weight=224, label=224];
2->8 [weight=449, label=449];
2->9 [weight=13, label=13];
2->10 [weight=197, label=197];
2->11 [weight=391, label=391];
3->4 [weight=2247, label=2247];
3->5 [weight=2154, label=2154];
3->6 [weight=24, label=24];
3->7 [weight=3093, label=3093];
3->8 [weight=936, label=936];
3->9 [weight=1, label=1];
3->10 [weight=26, label=26];
3->11 [weight=1504, label=1504];
4->5 [weight=1912, label=1912];
4->6 [weight=1968, label=1968];
4->7 [weight=956, label=956];
4->8 [weight=1159, label=1159];
4->9 [weight=32, label=32];
4->10 [weight=723, label=723];
4->11 [weight=2764, label=2764];
5->6 [weight=2759, label=2759];
5->7 [weight=987, label=987];
5->9 [weight=2, label=2];
5->10 [weight=1805, label=1805];
5->11 [weight=338, label=338];
6->7 [weight=2918, label=2918];
6->8 [weight=1842, label=1842];
6->9 [weight=3711, label=3711];
6->10 [weight=1446, label=1446];
6->11 [weight=1036, label=1036];
7->8 [weight=3268, label=3268];
7->9 [weight=656, label=656];
7->10 [weight=54, label=54];
7->11 [weight=95, label=95];
8->9 [weight=4669, label=4669];
8->10 [weight=1, label=1];
8->11 [weight=1334, label=1334];
9->10 [weight=3537, label=3537];
9->11 [weight=2597, label=2597];
10->11 [weight=3346, label=3346];
cheshmi commented 3 years ago

Indexing was the major issue. Thank you so much, works well now.

cheshmi commented 3 years ago

Also, I don't see rMLGPschedule in my exe folder even though I could build and use the partitioner. Did I miss it or is it not included in the repo?

myozka commented 3 years ago

It is not in this repo at the moment. You can check this page for the source.

Again, you can contact me if you need help with that one as well.