open-connectome-classes / StatConn-Spring-2015-Info

introductory material

Graph Matching for n>=1000 #213

Open DSP137 opened 9 years ago

DSP137 commented 9 years ago

It was said that running FAQ on graphs with 1000 or more vertices is still relatively slow. I'm wondering, are there any algorithms built to do graph matching for a larger number of vertices? i2g mentioned a way to approximate a graph's connections algorithmically. Would it be possible to modify that to work for graph matching in some way, and would we be able to bound the error that relying solely on algorithms to determine the matching might introduce?
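
For concreteness, a minimal sketch of plain FAQ on a pair of permuted random graphs, assuming a SciPy build that ships `scipy.optimize.quadratic_assignment` (the graph size, density, and seed here are arbitrary choices for illustration):

```python
import numpy as np
from scipy.optimize import quadratic_assignment

rng = np.random.default_rng(0)
n = 200  # FAQ's per-iteration cost grows roughly cubically, so n >= 1000 gets slow

# Random graph A and a secretly permuted copy B, so the true matching is known.
A = np.triu((rng.random((n, n)) < 0.1).astype(float), 1)
A = A + A.T
perm = rng.permutation(n)
B = A[perm][:, perm]

# FAQ relaxes the quadratic assignment problem to the doubly stochastic matrices,
# takes Frank-Wolfe steps, then projects the result back to a permutation.
res = quadratic_assignment(A, B, method="faq", options={"maximize": True})

# Fraction of vertices mapped back to their true counterparts.
print("matching accuracy:", np.mean(res.col_ind == np.argsort(perm)))
```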

jtmatterer commented 9 years ago

From what I've been told by Joshua, Carey, and Vince, LSGM is the first graph matching algorithm that can run in a reasonable amount of time for large graphs and still return (provably) good results.

ElanHR commented 9 years ago

I'd imagine that for many problems with n >= 1000 you could use multiple rounds of graph matching to get decent results: first match a lower-resolution/clustered graph, then match the graphs of the individual clusters.
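
To make that concrete, here is a rough sketch of the two-stage idea. The name `coarse_to_fine_match` is hypothetical, scikit-learn's spectral clustering and SciPy's FAQ solver are just stand-ins, and the handling of unequal cluster sizes (truncation) is a crude placeholder rather than a real solution:

```python
import numpy as np
from scipy.optimize import quadratic_assignment
from sklearn.cluster import SpectralClustering

def coarse_to_fine_match(A, B, k, seed=0):
    """Hypothetical two-stage matcher: match a k-block summary of each graph,
    then match the subgraphs of the paired blocks. Returns a map A-node -> B-node."""
    n = A.shape[0]
    labels_A = SpectralClustering(n_clusters=k, affinity="precomputed",
                                  random_state=seed).fit_predict(A)
    labels_B = SpectralClustering(n_clusters=k, affinity="precomputed",
                                  random_state=seed).fit_predict(B)

    # Lower-resolution graphs: total edge mass between clusters.
    CA = np.zeros((k, k))
    CB = np.zeros((k, k))
    for a in range(k):
        for b in range(k):
            CA[a, b] = A[np.ix_(labels_A == a, labels_A == b)].sum()
            CB[a, b] = B[np.ix_(labels_B == a, labels_B == b)].sum()

    # Round 1: match clusters of A to clusters of B.
    block_map = quadratic_assignment(CA, CB, method="faq",
                                     options={"maximize": True}).col_ind

    # Round 2: match vertices within each paired cluster.
    match = np.full(n, -1)
    for a in range(k):
        ia = np.where(labels_A == a)[0]
        ib = np.where(labels_B == block_map[a])[0]
        m = min(len(ia), len(ib))  # clusters rarely agree in size; truncate for the sketch
        if m < 2:
            continue
        res = quadratic_assignment(A[np.ix_(ia[:m], ia[:m])],
                                   B[np.ix_(ib[:m], ib[:m])],
                                   method="faq", options={"maximize": True})
        match[ia[:m]] = ib[:m][res.col_ind]
    return match
```

Deciding which clusters correspond, and what to do when their sizes disagree, is the fragile part of this scheme.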

imichaelnorris commented 9 years ago

I feel like if you broke the graph into pieces, ran the algorithm on each piece, and then ran it on the full graph once some of the smaller clusters were assigned, you wouldn't get a speedup. Can the resolution be increased on successive runs of the algorithm while still matching the same graph?

jtmatterer commented 9 years ago

@ElanHR I think your method would be highly sensitive to errors in the initial partitioning of the graphs. No matter how good the graph matching algorithm you use on each subgraph, if a vertex has no true partner in the subgraph it's matched against, it's going to be matched to an incorrect vertex (and that vertex will in turn be mismatched, and so on).

Also, the strategy you described encompasses LSGM, which clusters the joint ASE (adjacency spectral embedding). This is one of the reasons LSGM is defensible: under the stochastic blockmodel, the block memberships can be perfectly recovered from this clustering step with probability tending to 1 as the size of the graph tends to infinity.
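
As a very rough illustration of the "cluster the joint embedding" step (not the actual LSGM code): the sketch below computes an adjacency spectral embedding of each graph, stacks the two embeddings, and clusters the rows jointly with k-means. The function names are mine, and the real pipeline handles the alignment between the two embeddings (e.g., using seed vertices) rather than simply stacking them:

```python
import numpy as np
from sklearn.cluster import KMeans

def ase(A, d):
    """Adjacency spectral embedding: top-d eigenvectors of A, scaled by sqrt(|eigenvalue|)."""
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))

def joint_block_assignments(A, B, d, k, seed=0):
    """Embed both graphs, stack the embeddings, and cluster the rows jointly.
    Returns estimated block labels for A's vertices and for B's vertices."""
    XA, XB = ase(A, d), ase(B, d)
    X = np.vstack([XA, XB])
    labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
    n = A.shape[0]
    return labels[:n], labels[n:]
```

Once every vertex has a block label, graph matching only needs to be run within each block, which is where the computational savings come from.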

whock commented 9 years ago

I think @ElanHR's idea is interesting, but would clustering the set of nodes into a lower-resolution set `nodes_c` introduce errors? Inevitably there will be nodes in the original set that have no 1:1 correspondence in the graph you're trying to match against. Would this clustering amplify that issue? I think @jtmatterer hits on the same idea.

wrgr commented 9 years ago

Just a note: i2g applies image processing to the underlying image data, so those algorithms (image processing, computer vision) are of a different type, I think. If not, I can tell you how they work.