usc-cloud / goffish

USC GoFFish Graph Analytics Framework

Mutable API and implementation for SubgraphInstance #79

Open simmhan opened 11 years ago

simmhan commented 11 years ago

It is often necessary to collect aggregate stats across multiple subgraph instances within user code, both to store and process analytics results and to export the final results. An IMutableSubgraphInstance interface with "set" methods corresponding to the "get" methods in ISubgraphInstance, together with an implementation, would let users store state/aggregations across subgraph instances. The constructor should take an ISubgraph template object so that the topology of the immutable ISubgraphInstance can be duplicated. NOTE: this subgraph's properties are mutable only in memory and do not need to be written back to GoFS. It is just a convenience object. It also helps us "export" the subgraph instance to JSON using the ISubgraphInstance interface.

e.g.

    class MutableSubgraphInstance implements ISubgraphInstance, IMutableSubgraphInstance {

        // Inherit topology from the template. Define user properties that can be set/changed.
        MutableSubgraphInstance(ISubgraphTemplate template, VertexPropertyList userVertexProperties, EdgePropertyList userEdgeProperties) {

        }

        // Inherit topology and also a snapshot of the current property values for the instance.
        // Pass the names of the properties in the instance that have to be copied over.
        // This could be a copy-on-write.
        MutableSubgraphInstance(ISubgraphInstance initialInstanceValues, VertexPropertyList vertexPropertiesToCopy, EdgePropertyList edgePropertiesToCopy) {

        }

    }
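The shape proposed above can be sketched in a self-contained way. This is only an illustration under stated assumptions: the three interfaces below are simplified stand-ins rather than the real GoFS types, the method names (`vertexIds`, `getTemplate`, `getVertexProperty`, `setVertexProperty`) are hypothetical, property lists are reduced to lists of names, edge properties are left out for brevity, and the second constructor copies eagerly instead of using copy-on-write.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the GoFS template: topology only.
interface ISubgraphTemplate {
    List<Long> vertexIds();
}

// Hypothetical stand-in for the read-only instance API.
interface ISubgraphInstance {
    ISubgraphTemplate getTemplate();
    Object getVertexProperty(long vertexId, String property);
}

// Proposed mutable counterpart: a "set" for each "get".
interface IMutableSubgraphInstance {
    void setVertexProperty(long vertexId, String property, Object value);
}

class MutableSubgraphInstance implements ISubgraphInstance, IMutableSubgraphInstance {
    private final ISubgraphTemplate template;
    // property name -> (vertex id -> value); held in memory only, never written back
    private final Map<String, Map<Long, Object>> vertexValues = new LinkedHashMap<>();

    // Inherit topology from the template and declare new, initially empty user properties.
    MutableSubgraphInstance(ISubgraphTemplate template, List<String> userVertexProperties) {
        this.template = template;
        for (String name : userVertexProperties) {
            vertexValues.put(name, new HashMap<>());
        }
    }

    // Inherit topology plus a snapshot of the named properties (eager copy, not copy-on-write).
    MutableSubgraphInstance(ISubgraphInstance initial, List<String> vertexPropertiesToCopy) {
        this(initial.getTemplate(), vertexPropertiesToCopy);
        for (String name : vertexPropertiesToCopy) {
            for (long v : template.vertexIds()) {
                Object value = initial.getVertexProperty(v, name);
                if (value != null) {
                    vertexValues.get(name).put(v, value);
                }
            }
        }
    }

    @Override
    public ISubgraphTemplate getTemplate() {
        return template;
    }

    @Override
    public Object getVertexProperty(long vertexId, String property) {
        Map<Long, Object> values = vertexValues.get(property);
        return values == null ? null : values.get(vertexId);
    }

    @Override
    public void setVertexProperty(long vertexId, String property, Object value) {
        Map<Long, Object> values = vertexValues.get(property);
        if (values == null) {
            throw new IllegalArgumentException("Undeclared property: " + property);
        }
        values.put(vertexId, value);
    }
}

class MutableInstanceDemo {
    public static void main(String[] args) {
        ISubgraphTemplate template = () -> Arrays.asList(1L, 2L, 3L);
        MutableSubgraphInstance first =
                new MutableSubgraphInstance(template, Arrays.asList("vertex_color"));
        first.setVertexProperty(1L, "vertex_color", 7);

        // The copy starts from a snapshot; later writes do not touch the source.
        MutableSubgraphInstance copy =
                new MutableSubgraphInstance(first, Arrays.asList("vertex_color"));
        copy.setVertexProperty(1L, "vertex_color", 9);

        System.out.println(first.getVertexProperty(1L, "vertex_color")); // 7
        System.out.println(copy.getVertexProperty(1L, "vertex_color"));  // 9
    }
}
```

Because the class implements both interfaces, code that only reads can keep accepting ISubgraphInstance, while code that aggregates can hold the same object through IMutableSubgraphInstance.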

* Usage for connected components *

    // load template
    ISubgraph sg1 = Partition.load(sgid1);

    // create a mutable instance based on this template with one vertex property to hold the vertex color
    IMutableSubgraphInstance musg1 = new MutableSubgraphInstance(sg1, {"vertex_color":int}, {});

    // traverse template topology
    // update vertex color property in musg1 as you traverse

    // print result of musg1 having vertex colors
    jsonEmitter.print(musg1 as ISubgraphInstance);
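The connected-components usage above could look roughly like the following runnable sketch. Everything here is a stand-in: `TemplateStub`, `MutableInstanceStub`, and `ConnectedComponents` are hypothetical names that do not come from GoFS, the template is reduced to an adjacency map, and the traversal is a plain breadth-first search that writes a component number into the `vertex_color` property.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a loaded template: vertex id -> neighbour ids.
class TemplateStub {
    final Map<Long, List<Long>> adjacency = new LinkedHashMap<>();

    void addVertex(long v) {
        adjacency.computeIfAbsent(v, k -> new ArrayList<>());
    }

    void addEdge(long a, long b) { // undirected edge
        addVertex(a);
        addVertex(b);
        adjacency.get(a).add(b);
        adjacency.get(b).add(a);
    }
}

// Hypothetical stand-in for the mutable instance: in-memory vertex properties only.
class MutableInstanceStub {
    final TemplateStub template;
    private final Map<String, Map<Long, Object>> vertexValues = new HashMap<>();

    MutableInstanceStub(TemplateStub template, String... vertexProperties) {
        this.template = template;
        for (String p : vertexProperties) {
            vertexValues.put(p, new HashMap<>());
        }
    }

    void setVertexProperty(long v, String property, Object value) {
        vertexValues.get(property).put(v, value);
    }

    Object getVertexProperty(long v, String property) {
        return vertexValues.get(property).get(v);
    }
}

class ConnectedComponents {
    static final String COLOR = "vertex_color";

    // Traverse the template topology and store one color per connected component.
    static MutableInstanceStub color(TemplateStub sg) {
        MutableInstanceStub musg = new MutableInstanceStub(sg, COLOR);
        int nextColor = 0;
        for (long start : sg.adjacency.keySet()) {
            if (musg.getVertexProperty(start, COLOR) != null) {
                continue; // already reached from an earlier start vertex
            }
            Deque<Long> queue = new ArrayDeque<>();
            queue.add(start);
            musg.setVertexProperty(start, COLOR, nextColor);
            while (!queue.isEmpty()) {
                long v = queue.poll();
                for (long n : sg.adjacency.get(v)) {
                    if (musg.getVertexProperty(n, COLOR) == null) {
                        musg.setVertexProperty(n, COLOR, nextColor);
                        queue.add(n);
                    }
                }
            }
            nextColor++;
        }
        return musg;
    }

    public static void main(String[] args) {
        TemplateStub sg = new TemplateStub();
        sg.addEdge(1, 2);
        sg.addEdge(2, 3);
        sg.addEdge(4, 5);
        MutableInstanceStub musg = color(sg);
        for (long v : sg.adjacency.keySet()) {
            System.out.println(v + " -> " + musg.getVertexProperty(v, COLOR));
        }
    }
}
```

The template is never modified; all per-vertex state lives in the mutable instance, which is the split the issue asks for.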

sooniln commented 11 years ago

I'm still trying to get a good handle on exactly what's needed. So an instance is essentially a map of edge/vertex ids to values for various properties. Is this trying to assign values to vertices/edges that don't already have a value? Or is this for creating a new property? In which case, how is this different from Map<Long, Color> for the example above?

simmhan commented 11 years ago

There are two goals here:

  1. Use the same subgraph instance API mechanism to add new properties or modify a copy of existing properties in memory. While users can always create a Map from the vertex/edge ID to the property values, their code then needs to use both the GoFS subgraph instance and their own Map abstractions and switch between them. E.g. I have a function that is called on different subgraph instances and performs some incremental calculation. It has to return my custom Map objects along with a template, rather than a subgraph instance that can be passed back in for the next iteration (e.g. func Add(SG1, SG2) -> SG3). It is easier if we provide an interface that can be used uniformly.
  2. We’re already building tooling using the subgraph instance API to, say, export to JSON. I’d like to reuse such tools for graph instance objects that users create/modify rather than rebuild such tooling. E.g. I may want to average a vertex property’s value across time and export it as JSON to be rendered by a viz tool. If we can create a mutable graph instance to store these results, I could reuse the JSON exporter.
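Both goals can be illustrated with one small sketch: a function that takes instances and returns an instance, and a single exporter that works on stored and user-built instances alike. The types below are hypothetical stand-ins, not the GoFS API: property values are fixed to `Double`, the topology is reduced to a list of vertex ids, and `toJson` is a hand-rolled placeholder for the real JSON tooling.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical read-only view shared by stored and in-memory instances.
interface ISubgraphInstance {
    List<Long> vertexIds();
    Double getVertexProperty(long vertexId, String property);
}

// Hypothetical mutable counterpart.
interface IMutableSubgraphInstance {
    void setVertexProperty(long vertexId, String property, Double value);
}

class MutableSubgraphInstance implements ISubgraphInstance, IMutableSubgraphInstance {
    private final List<Long> vertexIds;
    private final Map<String, Map<Long, Double>> values = new LinkedHashMap<>();

    MutableSubgraphInstance(List<Long> vertexIds) {
        this.vertexIds = vertexIds;
    }

    @Override
    public List<Long> vertexIds() {
        return vertexIds;
    }

    @Override
    public Double getVertexProperty(long vertexId, String property) {
        Map<Long, Double> m = values.get(property);
        return m == null ? null : m.get(vertexId);
    }

    @Override
    public void setVertexProperty(long vertexId, String property, Double value) {
        values.computeIfAbsent(property, k -> new LinkedHashMap<>()).put(vertexId, value);
    }
}

class InstanceTools {
    // Goal 1: instances in, instance out, so the result can feed the next iteration.
    static ISubgraphInstance average(List<ISubgraphInstance> instances, String property) {
        List<Long> ids = instances.get(0).vertexIds();
        MutableSubgraphInstance out = new MutableSubgraphInstance(ids);
        for (long v : ids) {
            double sum = 0;
            int n = 0;
            for (ISubgraphInstance instance : instances) {
                Double x = instance.getVertexProperty(v, property);
                if (x != null) {
                    sum += x;
                    n++;
                }
            }
            if (n > 0) {
                out.setVertexProperty(v, property, sum / n);
            }
        }
        return out;
    }

    // Goal 2: one exporter, written against the read-only interface only.
    static String toJson(ISubgraphInstance instance, String property) {
        StringBuilder sb = new StringBuilder("{");
        String sep = "";
        for (long v : instance.vertexIds()) {
            Double x = instance.getVertexProperty(v, property);
            if (x == null) {
                continue;
            }
            sb.append(sep).append('"').append(v).append("\":").append(x);
            sep = ",";
        }
        return sb.append("}").toString();
    }
}

class AverageDemo {
    public static void main(String[] args) {
        List<Long> ids = Arrays.asList(1L, 2L);
        MutableSubgraphInstance t1 = new MutableSubgraphInstance(ids);
        MutableSubgraphInstance t2 = new MutableSubgraphInstance(ids);
        t1.setVertexProperty(1L, "load", 1.0);
        t1.setVertexProperty(2L, "load", 4.0);
        t2.setVertexProperty(1L, "load", 3.0);
        t2.setVertexProperty(2L, "load", 6.0);

        List<ISubgraphInstance> timeline = Arrays.asList(t1, t2);
        ISubgraphInstance avg = InstanceTools.average(timeline, "load");
        System.out.println(InstanceTools.toJson(avg, "load")); // {"1":2.0,"2":5.0}
    }
}
```

With a plain Map, `average` would have to return the map plus a template, and `toJson` would need a second code path for it; here the exporter never learns whether an instance came from storage or from user code.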

--Yogesh



From: Soonil Nagarkar [mailto:notifications@github.com] Sent: Saturday, July 20, 2013 11:11 AM To: usc-cloud/goffish Cc: Yogesh Simmhan Subject: Re: [goffish] Mutable API and imlementation for SubgraphInstance (#79)
