support grouping instance when writing slices

simmhan commented 11 years ago

Is this to do with the case where we have millions of subgraphs per partition (that we may have to group into a single slice), or a different issue?

sooniln commented 11 years ago

This will help a bit, since it artificially reduces the effective number of instances. For example, if we group every 2 instances together, it would cut the # of effective instances in half, and therefore the # of slices in half. The problem is that when # of slices = # subgraphs * # properties * # instances, # subgraphs seems to be by far the dominating term. So this will help a bit, but not nearly as much as reducing the # of subgraphs.
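To make the arithmetic concrete, here is a rough sketch of the slice-count formula above. The function name `slice_count`, the `instances_per_slice` parameter, and all of the numbers are made up for illustration; they are not taken from the GoFFish code or from any real dataset.

```python
import math


def slice_count(num_subgraphs, num_properties, num_instances, instances_per_slice=1):
    """# of slices = # subgraphs * # properties * # effective instances.

    Grouping instances only shrinks the last factor (hypothetical model).
    """
    effective_instances = math.ceil(num_instances / instances_per_slice)
    return num_subgraphs * num_properties * effective_instances


# Hypothetical partition: 1M subgraphs, 4 properties, 10 instances.
ungrouped = slice_count(1_000_000, 4, 10)
# Grouping every 2 instances together halves the slice count.
grouped_instances = slice_count(1_000_000, 4, 10, instances_per_slice=2)
# Reducing the subgraph term by 100x (e.g. by packing small subgraphs
# together) shrinks the count far more.
fewer_subgraphs = slice_count(10_000, 4, 10)

print(ungrouped, grouped_instances, fewer_subgraphs)
```

With these assumed numbers, grouping instances in pairs goes from 40M slices to 20M, while cutting the subgraph term by 100x goes to 400K, which is the asymmetry described above.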

simmhan commented 11 years ago

I’m wondering if we should instead group small subgraphs together in a single slice. The programming model we’re using operates incrementally over time, one time period per superstep, but likely tries to access all subgraphs at once within a superstep (though on different threads). So having the same time duration in a slice but including several smaller subgraphs will help this model. However, if the user operates on subgraphs incrementally within a superstep (e.g. due to memory constraints), then there may be an issue, since we may end up reloading the same slice to serve different subgraphs. Maybe in such a case, we should provide an iterator over the subgraphs rather than a list. This will allow us to make the most use of the slices that have been loaded.
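A minimal sketch of the iterator idea, assuming a mapping from subgraph id to the slice that holds it and a loader callback. The names `iter_subgraphs_by_slice`, `subgraph_to_slice`, and `load_slice` are hypothetical and do not correspond to the actual GoFFish API; the point is only that iterating in slice order lets each slice be loaded once.

```python
from collections import defaultdict


def iter_subgraphs_by_slice(subgraph_to_slice, load_slice):
    """Yield (subgraph_id, subgraph_data) pairs, loading each slice once.

    subgraph_to_slice: dict mapping subgraph id -> slice id (assumed layout).
    load_slice: callable taking a slice id and returning a dict of
                subgraph id -> subgraph data for that slice (assumed loader).
    """
    # Bucket subgraph ids by the slice that contains them.
    by_slice = defaultdict(list)
    for subgraph_id, slice_id in subgraph_to_slice.items():
        by_slice[slice_id].append(subgraph_id)

    # Serve every subgraph in a slice before moving to the next slice,
    # so only one slice needs to be held in memory at a time.
    for slice_id, subgraph_ids in by_slice.items():
        slice_data = load_slice(slice_id)
        for subgraph_id in subgraph_ids:
            yield subgraph_id, slice_data[subgraph_id]
```

Because this is a generator, a memory-constrained caller can process one subgraph at a time, whereas a plain list would leave the access order, and hence the number of slice reloads, up to the caller.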

simmhan commented 11 years ago

Same as #70

usc-cloud / goffish

support grouping instance when writing slices #66