Two sets of edges - Githubissues

DanqingZ commented 8 years ago

Hi all,

I have a general question about SnapVX.

Currently I have a social network with location information of each individual, which mean apart from the social network, I also have neighborhood based on geographical information. My assumptions of the model is that: the people who are connected have similar coefficients, the people who are in the same neighborhood have similar coefficients.

That means I have two sets of edges based on current nodes, then I can add two sets of network lasso terms.I am wondering if current SnapVX package can handle this problem?

My current implementation is that I add that I construct graph based on social network and first add edge constraints:

L =0.01
#0.004
def laplace_reg(src, dst, data):
    return (L*norm(src['x'] - dst['x']), [])

gvx.AddEdgeObjectives(laplace_reg)

Add then loop within neighborhood to add neighborhood effect:

for i in range(len(c2)):
    for j in range(i+1,len(c2)):
        src_id=i
        dst_id=j
        (src_vars, dst_vars) = (gvx.GetNodeVariables(src_id), gvx.GetNodeVariables(dst_id))
        L_2 = 0.02
        edge_obj=  L_2*square(src_vars['x']-dst_vars['x'])
        # gvx.SetEdgeObjective(src_id,dst_id,edge_obj)
#         print 'yes!'
        gvx.AddEdge(src_id, dst_id,Objective=edge_obj)

But it seems but implementation is not working anyway:

Combined backtracking failed 90 0 0 0 sigma 1
Combined backtracking failed 90 0 0 0 sigma 1
Combined backtracking failed 90 0 0 0 sigma 1
Combined backtracking failed 90 0 0 0 sigma 1
Combined backtracking failed 90 0 0 0 sigma 1

---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-42-cbb022e6b30a> in <module>()
----> 1 gvx.Solve(Verbose=True, Rho=0.1)

/Users/danqing0703/anaconda/lib/python2.7/site-packages/snapvx.pyc in Solve(self, M, UseADMM, NumProcessors, Rho, MaxIters, EpsAbs, EpsRel, Verbose, UseClustering, ClusterSize)
    146         if UseADMM and self.GetEdges() != 0:
    147             self.__SolveADMM(NumProcessors, Rho, MaxIters, EpsAbs, EpsRel,
--> 148                              Verbose)
    149             return
    150         if Verbose:

/Users/danqing0703/anaconda/lib/python2.7/site-packages/snapvx.pyc in __SolveADMM(self, numProcessors, rho_param, maxIters, eps_abs, eps_rel, verbose)
    440             if os.name != 'nt':
    441                 pool.map(ADMM_x, node_list)
--> 442                 pool.map(ADMM_z, edge_list)
    443                 pool.map(ADMM_u, edge_list)
    444             else:

/Users/danqing0703/anaconda/lib/python2.7/multiprocessing/pool.pyc in map(self, func, iterable, chunksize)
    249         '''
    250         assert self._state == RUN
--> 251         return self.map_async(func, iterable, chunksize).get()
    252 
    253     def imap(self, func, iterable, chunksize=1):

/Users/danqing0703/anaconda/lib/python2.7/multiprocessing/pool.pyc in get(self, timeout)
    565             return self._value
    566         else:
--> 567             raise self._value
    568 
    569     def _set(self, i, obj):

Exception: The 'minimize' objective must resolve to a scalar.

davidhallac commented 8 years ago

We unfortunately don’t have an elegant way to incorporate two sets of edges to the same set of nodes, but I do think there are workarounds you can use.

Your code will overwrite any previous edges between those two nodes, so that won't work. Instead, if edge_ij has both types (social and spatial), you'll have to add them simultaneously - i.e. something like gvx.AddEdge(i,j,Objective=L1_norm(src['x']-dst['x']) + L2_square(src['x'] - dst['x']).

It might take a bit longer to set up the problem, but scalability-wise it shouldn't affect the runtime.

As for the combined backtracking error, it might go away once you change the code. If not, it depends on your optimization problem, but it looks like it might be due to this: https://github.com/embotech/ecos/issues/134 (ECOS is the underlying solver we call for the ADMM subproblems). If that continues being an issue, let us know and I can help you call another solver.

DanqingZ commented 8 years ago

what is i and j in gvx.AddEdge(i,j,Objective=L1norm(src['x']-dst['x']) + L2square(src['x'] - dst['x'])? It seems they node index here, but src in src['x'] is the node class. how could I return to node class when I just have node index? just go over snap.py,but could not find one...

DanqingZ commented 8 years ago

This is the current code I have, but stuck on the step how to add objectives and constraints of edges through a loop

num_nodes = 500
temp = GenRndDegK(num_nodes, 3)
gvx2= TGraphVX()

for i in range(500):
    W = Variable(3,name='W')
    b = Variable(1,name='b')
#     obj = logistic(-y[i]*x.T*A[i,:])
    obj = logistic(A[i,:].T*W+b-y[i])
    gvx2.AddNode(i, Objective=obj)

L1 = 0.02
for EI in temp.Edges(): 
    count = count +1
    if count % epoch ==0:
        print "edge (%d, %d)" % (EI.GetSrcNId(), EI.GetDstNId())
    a = EI.GetSrcNId()
    b = EI.GetDstNId()
    gvx2.AddEdge(a,b,Objective=L1*norm(a['b']-b['b']),Constraints=[a['W']-b['W']])

edges = []
for EI in temp.Edges(): 
    count = count +1
    if count % epoch ==0:
        print "edge (%d, %d)" % (EI.GetSrcNId(), EI.GetDstNId())
    edges.append([EI.GetSrcNId(), EI.GetDstNId()])

count = 0
for i in range(500):
    if [i,i+1] not in edges and [i+1,i] not in edges:
        count +=1
        j = i+1
        gvx2.AddEdge(i,j,Constraints=[i['W']-j['W']])

print count

DanqingZ commented 8 years ago

when I just have the node_id, I do not know how to get node variables to use them in edge objectives and edge constraints. That is exactly my question.

davidhallac commented 8 years ago

To get the node variable from the node id, you use gvx.GetNodeVariables(node_id). It will return a dictionary of 'var_name':variable, so to get the variable 'x' from node 3, you would do

temp = gvx.GetNodeVariables(3) varX = temp['x']

snap-stanford / snapvx

Two sets of edges #19