stingergraph / StingerGraphs.jl

Julialang bindings to the STINGER graph database
http://www.stingergraph.com
Other
5 stars 3 forks source link

edgeweight function much slower than edgeparse #38

Open edward-kao opened 6 years ago

edward-kao commented 6 years ago

Not sure this is an issue, but I noticed that retrieving edges weights using the edgeweight(g, src, dest, etype) function is significantly slower (~20 times) than just parsing the edge (i.e. edgeparse(edge)). Perhaps I am not retrieving the edge weight the right way? I am doing this on directed and weighted graphs with 1-10M edges. Thank you!

jpfairbanks commented 6 years ago

@rohitvarkey, is this 20x slowdown consistent with the C call overhead, or is it due to the fact that edgeweight looks at all the edges of a vertex to find the edge, while edgeparse already has the edge and is just extracting the weight?

rohitvarkey commented 6 years ago

I think the latter is probably the right explanation. The Julia C overhead is pretty small. But the ccall to stinger_edgeweight ends up traversing all the edges of a vertex to find the edge rather than just reading the value off an edge already in memory. With more edges, the overhead will be more evident as the C function has to traverse more edges in the stinger_edgeweight function.

edward-kao commented 6 years ago

Thank you for the explanation. Given the source and destination vertices of an edge, does Stinger have a function to retrieve its weight without traversing through all the edges of a vertex? If so, making it available in the Julia-Stinger would be highly desirable!

davidediger commented 6 years ago

This is unfortunately one area where STINGER performance is going to poor (at least on high-degree vertices). STINGER was designed with traversal in mind, so it's database-like query features are not great. Fixing this would require either a second index over the edges (prohibitively expensive) or a re-design of the linked list of blocks adjacency structure. We have had discussions about how we might approach this, but nothing has been prototyped yet.

jpfairbanks commented 6 years ago

Right, the design of stinger is one based on implementing high performance parallel algorithms, which means that any operation on one edge is not optimized.

What is the circumstance surrounding your need to get a single edge out of the stinger? Can we restructure the higher level algorithm to avoid this operation?

edward-kao commented 6 years ago

I am considering the cases for streaming updates. For example, if the edge weight represents the number of interactions between two specific vertices, one may want to increment that weight when new interactions are observed. That will likely involve access and modification of the weight of a specific edge.

In some cases though, the updates will involve all the edges to and from a certain vertex. Is there a fast way to retrieve the weights of all such edges? I am using the foralledges iterator to access each edge of a certain vertex now, but it's not clear how I can get their weights.

rohitvarkey commented 6 years ago

I am considering the cases for streaming updates. For example, if the edge weight represents the number of interactions between two specific vertices, one may want to increment that weight when new interactions are observed. That will likely involve access and modification of the weight of a specific edge.

I was looking through the stinger C source and it seems like there is another C function, called stinger_incr_edge that increments the edge weight passed in rather than set it like stinger_insert_edge (which is called fom Julia currently) does. I should be able to add an interface for you to use that function if you'd like. However, this C function also has to iterate through all the edges of the vertex before being able to update it.

In some cases though, the updates will involve all the edges to and from a certain vertex. Is there a fast way to retrieve the weights of all such edges? I am using the foralledges iterator to access each edge of a certain vertex now, but it's not clear how I can get their weights.

The following will work to load the edges and the edge weights into Julia. But you will not be able to update the weights in the C memory as it the edge has been loaded into Julia already.

julia> s = Stinger()
StingerGraphs.Stinger(Ptr{Void} @0x00000001e4e7b000)

julia> for i=1:5
           insert_edge!(s, 0, 0, i, i, 0)
       end

julia> foralledges(s, 0) do e, src, etype
           @show e.weight
       end
e.weight = 1
e.weight = 2
e.weight = 3
e.weight = 4
e.weight = 5

Another way to solve this for fast updation of all vertices would be write your own foralledgesupdate function. I've written up a quick implementation in #40.

ehein6 commented 6 years ago

stinger_batch_incr_edges is the way to go here. You provide a list of edges to update and it applies all of them in parallel, adding up the weights. It batches up the updates so that each neighbor list is only traversed once. I think we held off on providing a wrapper because this function is implemented using C++, which is harder to interface with Julia.

edward-kao commented 6 years ago

Yes, the e.weight allowed me to access the edge weights during a traversal, which is fast. I noticed the weights are reported in one direction but not the other and opened up a separate issue on this. I'll follow up on the batch update in #40