tapaswenipathak / linux-kernel-stats

linux kernel stats (Publication [Journal, Magazine]). This repository has code files.
MIT License

Summary - PowerGraph: Distributed Graph-Parallel Computation on Natural Graphs #87

Closed reddheeraj closed 1 year ago

reddheeraj commented 1 year ago

https://www.usenix.org/system/files/conference/osdi12/osdi12-final-167.pdf

reddheeraj commented 1 year ago

PowerGraph

The authors, Joseph E. Gonzalez, Yucheng Low, Haijie Gu, Danny Bickson, and Carlos Guestrin, examine the difficulties in handling large-scale graph-based computations, particularly on real-world graphs whose degree distributions follow a skewed power law. They propose the PowerGraph abstraction, which exploits the internal structure of graph programs to optimize distributed graph placement and representation, and they evaluate PowerGraph against other common graph-parallel systems, showing significant performance improvements. PowerGraph addresses the limitations of existing systems in dealing with large-scale graph-structured data in machine learning and data mining (MLDM). By factoring computation over the edges of each vertex, PowerGraph achieves greater parallelism, reduces network communication and storage costs, and enables a new approach to distributed graph placement.

Pregel is a graph-parallel abstraction for MLDM that encodes computation as vertex-programs that run in parallel and interact along edges in the graph. It uses a bulk synchronous message-passing model and introduces user-defined functions called message combiners that help reduce communication overhead. GraphLab is an asynchronous distributed shared memory abstraction in which vertex-programs have shared access to a distributed graph with data stored on every vertex and edge. GraphLab ensures serializability by preventing neighboring program instances from running simultaneously. Natural graphs, like social networks, have a skewed power-law degree distribution where a small fraction of vertices are adjacent to a large fraction of edges. This causes work imbalance, poor partitioning, communication bottlenecks, storage issues, and limits scalability in distributed graph-parallel computation. These challenges are specific to the skewed degree distribution and are not fully characterized by other natural graph models.
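The Pregel model described above can be made concrete with a small sketch. This is a hypothetical, minimal PageRank superstep in the Pregel style, not Pregel's actual API: vertices send `rank / out_degree` along their out-edges, and a user-defined sum combiner collapses all messages addressed to the same vertex, which is what reduces communication overhead.

```python
# Illustrative sketch of one bulk-synchronous Pregel-style superstep for
# PageRank with a sum combiner. Names and structure are ours, not Pregel's.
from collections import defaultdict

def superstep(out_edges, rank, damping=0.85):
    """Vertices send rank/out_degree along edges; the combiner accumulates
    messages per target so only one value per vertex crosses the network."""
    combined = defaultdict(float)      # combiner: one accumulated message per target
    for v, targets in out_edges.items():
        if targets:
            share = rank[v] / len(targets)
            for t in targets:
                combined[t] += share   # combine instead of queueing each message
    # every vertex then updates its rank from its single combined message
    return {v: (1 - damping) + damping * combined[v] for v in rank}

edges = {1: [2, 3], 2: [3], 3: [1]}
ranks = {1: 1.0, 2: 1.0, 3: 1.0}
ranks = superstep(edges, ranks)
```

On a power-law graph, the weakness the paper identifies is visible here: a high-degree vertex still fans out to (or collects from) a huge number of edges on one machine, which is what PowerGraph's edge-level factoring avoids.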

PowerGraph is a new graph-parallel abstraction that addresses these challenges on power-law graphs. It combines features from Pregel and GraphLab, supporting both the highly parallel bulk-synchronous model of Pregel and the asynchronous model of GraphLab. The PowerGraph engine supports both synchronous and asynchronous execution. The asynchronous mode can behave non-deterministically, complicating algorithm design and debugging, and can lead to instability or even divergence for some algorithms. GraphLab addresses these problems by enforcing serializability through a fine-grained locking protocol, but that protocol is unfair to high-degree vertices. PowerGraph overcomes these limitations with a new parallel locking protocol that is fair to high-degree vertices and exposes more fine-grained (edge-level) parallelism. PowerGraph combines data from adjacent vertices and edges, with constraints on accumulator size and apply-function complexity, to support natural graphs.
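The edge-level parallelism comes from the paper's Gather-Apply-Scatter (GAS) decomposition, which the conclusion also names. Below is a sketch of that decomposition for PageRank; the phase names follow the paper, but the driver loop and exact signatures are our own illustrative scaffolding.

```python
# Sketch of the Gather-Apply-Scatter (GAS) decomposition using PageRank.
# gather runs once per in-edge and its results are merged with a commutative,
# associative sum, so a high-degree vertex's gathers can run in parallel
# across the machines holding its edges. apply sees only the constant-size
# accumulator, matching the paper's constraint on accumulator size.

def gather(rank_u, out_degree_u):
    # per-edge contribution from in-neighbor u
    return rank_u / out_degree_u

def apply(acc, damping=0.85):
    # per-vertex update from the gathered accumulator
    return (1 - damping) + damping * acc

def gas_pagerank(in_edges, out_degree, ranks, iters=10):
    for _ in range(iters):
        new = {}
        for v, nbrs in in_edges.items():
            acc = sum(gather(ranks[u], out_degree[u]) for u in nbrs)
            new[v] = apply(acc)
        ranks = new
    return ranks
```

A scatter phase (omitted here for brevity) would then activate neighbors whose inputs changed, which is how PowerGraph drives further computation.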

PowerGraph minimizes the number of vertex replicas in the distributed graph and thus reduces storage and communication requirements. The authors also propose a greedy algorithm that improves upon randomly constructed vertex cuts by de-randomizing the edge-placement process.
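The greedy de-randomization can be sketched as follows. In a vertex cut, each edge lives on exactly one machine and a vertex is replicated on every machine holding one of its edges; the heuristic places each new edge so as to reuse existing replicas when possible. This is a simplified rendering of the paper's case analysis (tie-breaking and the distributed coordination variants are omitted):

```python
# Hedged sketch of greedy vertex-cut edge placement. replicas[v] plays the
# role of the paper's A(v), the set of machines already holding a replica of v.

def greedy_place(edges, n_machines):
    placement = {}                        # edge -> machine
    replicas = {}                         # vertex -> set of machines (A(v))
    load = [0] * n_machines               # edges assigned per machine

    def least_loaded(candidates):
        return min(candidates, key=lambda m: load[m])

    for (u, v) in edges:
        a_u = replicas.setdefault(u, set())
        a_v = replicas.setdefault(v, set())
        if a_u & a_v:                     # both already share a machine: no new replica
            m = least_loaded(a_u & a_v)
        elif a_u or a_v:                  # reuse an existing replica of either endpoint
            m = least_loaded(a_u | a_v)
        else:                             # neither is placed yet: balance load
            m = least_loaded(range(n_machines))
        placement[(u, v)] = m
        a_u.add(m); a_v.add(m)
        load[m] += 1
    return placement, replicas
```

Minimizing the average size of `replicas[v]` (the replication factor) is exactly what cuts storage and communication, since each replica must be kept in sync.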

Abstraction Comparison The paper evaluates computation and communication imbalance for PageRank across three graph abstractions (GraphLab, PowerGraph, and Pregel) on five synthetic power-law graphs with α ranging from 1.8 to 2.2 and ten million vertices, run on an eight-node Linux cluster. The results show that PowerGraph eliminates the sequential dependence in processing the edges of high-degree vertices, making it immune to the work imbalance caused by low α. Pregel experiences higher work imbalance on fan-out graphs and communicates more than PowerGraph and GraphLab on them, while the communication volume of GraphLab and PowerGraph is invariant to fan-in versus fan-out. PowerGraph performs best in both communication and computation, with greedy partitioning providing a further 25-50% improvement in runtime.
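The synthetic graphs in this comparison draw vertex degrees from a discrete power law, P(d) ∝ d^(-α), so a smaller α means a heavier tail and more very-high-degree vertices. A small sketch of sampling such a degree sequence (the truncation bound and sampler are our own assumptions, not the paper's generator):

```python
# Sketch of sampling a power-law degree sequence, P(d) proportional to d^-alpha,
# as used for the synthetic fan-in/fan-out graphs. d_max truncation is ours.
import random

def sample_powerlaw_degrees(n, alpha, d_max=1000, seed=0):
    rng = random.Random(seed)
    # unnormalized discrete power-law weights over degrees 1..d_max
    weights = [d ** (-alpha) for d in range(1, d_max + 1)]
    return rng.choices(range(1, d_max + 1), weights=weights, k=n)

degs = sample_powerlaw_degrees(10_000, alpha=2.0)
```

Even at α = 2.0, most vertices end up with degree 1 while a handful dominate the edge count, which is the skew that stresses Pregel and GraphLab in this experiment.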

The results show that PowerGraph is more robust to work and communication imbalance than Pregel and GraphLab, particularly when dealing with highly-skewed graphs. The article also discusses the implementation and evaluation of three different engines for graph-parallel computation: the synchronous engine, the asynchronous engine, and the asynchronous serializable engine. The engines demonstrate good scalability and computational efficiency.

Brief Implementation Graph data is loaded in parallel across all instances from text files in a distributed file system (HDFS). The PowerGraph engine has two implementations: synchronous and asynchronous. The synchronous implementation runs multithreaded on each instance and executes deterministically. The asynchronous implementation uses a state machine per vertex and assigns cores to active vertices to hide communication latency. It shows moderate throughput gains from improved partitioning and delta caching.
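The delta-caching optimization mentioned above can be illustrated briefly. Instead of re-running a vertex's full gather over all in-edges each time it is activated, the engine caches the gathered accumulator and lets a neighbor's scatter push only its change. The class below is our own toy rendering of that idea, not PowerGraph's API:

```python
# Illustrative sketch of delta caching: keep each vertex's gathered
# accumulator and update it incrementally when one neighbor changes.

class DeltaCache:
    def __init__(self):
        self.acc = {}                     # vertex -> cached gather accumulator

    def gather_full(self, v, contribs):
        # expensive path: recompute from every in-edge contribution
        self.acc[v] = sum(contribs)
        return self.acc[v]

    def scatter_delta(self, v, old_contrib, new_contrib):
        # cheap path: adjust the cached sum by one neighbor's change
        if v in self.acc:
            self.acc[v] += new_contrib - old_contrib
        return self.acc.get(v)
```

For a high-degree vertex, this turns an O(degree) gather into an O(1) update per changed neighbor, which is why it helps most on power-law graphs.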

The Async engine provides high performance for various tasks but is difficult to program. The Async+S engine was created to enforce serializability, solving the dining philosophers' problem in graph-parallel computation. The Chandy-Misra solution was implemented for a high degree of parallelism and extended to the vertex-cut setting. It was tested on graph coloring and the ALS algorithm and was found to perform more uniformly, converge faster, and be necessary for serializability in many applications. The complexity of the Async+S engine is justified by the need for serializability and predictability in nondeterministic asynchronous execution.
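The paper implements the Chandy-Misra dining-philosophers protocol, extended to vertex cuts, to get serializability with high parallelism. As a much simpler illustration of the same guarantee (no two adjacent vertex-programs run concurrently, and no deadlock), one can acquire per-vertex locks in a global order; this is a stand-in technique, not the paper's protocol:

```python
# Simplified stand-in for serializable neighborhood execution: acquiring the
# locks for a vertex and its neighbors in one global (sorted) order rules out
# the circular waits that cause deadlock. Chandy-Misra achieves the same goal
# with better fairness for high-degree vertices.
import threading

def make_locks(vertices):
    return {v: threading.Lock() for v in vertices}

def run_serializable(locks, v, neighbors, body):
    """Run a vertex-program with exclusive access to v and its neighborhood."""
    scope = sorted({v, *neighbors})       # total order over the locked scope
    for u in scope:
        locks[u].acquire()
    try:
        return body()                     # vertex-program runs in isolation
    finally:
        for u in reversed(scope):
            locks[u].release()
```

The fairness problem the paper solves is visible even in this toy: a high-degree vertex must hold many locks at once, starving its neighbors, which is why PowerGraph needed a protocol that treats such vertices fairly.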

The PowerGraph system handles fault tolerance by saving snapshots of the data-graph during execution, which can be done either between super-steps or during execution. PowerGraph outperforms other systems in PageRank and Triangle Counting tasks, and is comparable in performance to the state-of-the-art solution in the Latent Dirichlet Allocation (LDA) task, with only 200 lines of code.

Related Work The paper discusses the vertex-cut approach to distributed graph placement, which is related to work in hypergraph partitioning. However, the authors note that existing hypergraph partitioning methods can be time-intensive. In this paper, the authors focus on the novel concept of a "streaming vertex-cut." In comparison to other graph-parallel abstractions, the authors discuss Pregel and GraphLab in detail. They also mention other related graph-parallel frameworks, including BPGL, Kineograph, and a distributed graph database presented by Pujol et al., but note that these do not explicitly consider power-law structure. Finally, the authors mention the impressive performance of GraphChi, an efficient single-machine disk-based implementation of the GraphLab abstraction. GraphChi is able to significantly outperform large Hadoop deployments on many graph problems while using only a single machine. The authors suggest that the techniques used in GraphChi could be used to add out-of-core storage to PowerGraph.

To conclude, the authors developed the PowerGraph abstraction to address challenges in partitioning and balancing work for graphs with power-law degree distributions. PowerGraph uses the Gather-Apply-Scatter model and vertex cuts to reduce storage and communication costs and improve scalability and efficiency on large-scale problems.

KavitaMeena23 commented 1 year ago

mention the following in your summary:

  1. mention abstraction comparison (section 6)
  2. related work (section 8)
  3. brief implementation (subsections under sec 7)
KavitaMeena23 commented 1 year ago

Reviewed

duttabhishek0 commented 1 year ago

@tapaswenipathak