GT-TDAlab / dagP

Multilevel Directed Acyclic Graph Partitioner
GNU Lesser General Public License v3.0
28 stars 5 forks source link

Error: In topsortPart, not every nodes are sorted: to = 0, nbpart = 2 #5

Closed phyzhenli closed 1 week ago

phyzhenli commented 1 year ago

Hi,

I encountered an error when using useapi. The program exited with the following outputs:

./useapi aig.dot 35

input: 
nbPart: 35

****** Error ******
In topsortPart, not every nodes are sorted: to = 0, nbpart = 2
*******************
Error in Execution

How should I solve it? Thanks for your help in advance. The attachment is the dot file.

dot.zip

myozka commented 1 year ago

Hello,

This error shows up when the partitioner cannot find an acyclic partition for any reason. It can't build the topological ordering, and reports that it could not topologically sort all vertices of the partgraph. (Later versions will have better error messages.)

For me, it works without a problem on Mac OS (gcc-11) but stochastically failed on Linux (gcc-10).

Note that the useapi.cpp file is only for demonstrating how to write your own .cpp file. It might be that it is unable to reliably find a valid partition with the given configuration in your case. You might want to explicitly select the partitioning algorithm variant and options.

I will take a closer look at this.

In the meantime, can you try running ./exe/rMLGP aig.dot 35 (multiple times) and see what it does?

Also, can you share your config.py file, compilation command, and any other relevant configuration info?

phyzhenli commented 1 year ago

Hi,

I continuously tested ./exe/rMLGP aig.dot 35 10 times and found that it does indeed fail stochastically (2/10 success).

My config.py:

import os
metis             = 1
# scotch            = 0
# debug             = 1
# build_threads     = 8
# debug_flags       = '-ggdb3'
# m64               = 1
CC                = 'gcc'
CXX               = 'g++'
LINK              = 'g++'
# defines            = '_DUMP_HYPERGRAPH'
# CCFLAGS            = '-Weverything -Wno-bad-function-cast -Wno-cast-qual -Wno-sign-conversion -Wno-padded -Wno-missing-prototypes -Wno-reserved-id-macro'
# CXXFLAGS          = '-std=c++11'
# LINKFLAGS         = ''
# extdefines        = ''
extincludes       = '/home/zli/Documents/PhD/AI_EDA/Partition_mcyang/submit/metis/include:'
extlibs           = 'metis'
extlibpath        = '/home/zli/Documents/PhD/AI_EDA/Partition_mcyang/submit/metis/build/Linux-x86_64/libmetis'

The OS is ubuntu 20.04 and the compiler is gcc-9.4.0.

myozka commented 1 year ago

With a quick look, it seems the initial partitioning choice is having trouble finding a valid partitioning that adheres to the weight balance constraint and fail occasionally. Current code does not allow returning a partitioning result that is not within given constraints. We can relax that and make it best-effort instead. I plan to have a closer look later on and consider the best options for when this happens.

In the meantime, I can suggest you some parameter alternatives to help you continue with your own project. That is, if you are not strictly interested in the best performance of the partitioner or the performance of the default parameter settings and balance constraint.

If 35 parts is not strictly required and you can get away 32 parts.. It works much better with powers of two, since it is a recursive bisection based algorithm.

Otherwise, most likely, you should be fine with using opt.conpar=0 and opt.inipart=11 options.

Please let me know and we can discuss what alternatives you can try.

phyzhenli commented 1 year ago

Thanks for your suggestions! I set opt.conpar=0 and opt.inipart=11. All benchmarks are PASSED.