Closed nithinmanoj10 closed 1 year ago
Since compute-intensive operations are frequently a part of graph analytics, GPUs have been widely used to speed up processing. However, the representative graphs in many applications change frequently, necessitating a rebuild of the graph structure on GPUs to account for the updates. As a result, the bottleneck of processing high-speed graph feeds is reconstructing the graphs.
Preliminary research suggests that the cache-insensitive data structure known as the Packed Memory Array (PMA) may be used to maintain dynamic graphs on GPUs. In order to enable quick updates with a constant bounded gap ratio, PMA was initially developed for CPUs. However, PMA does not support parallel insertions, and applying PMA to streaming updates is difficult because updates frequently arrive in batches rather than one at a time. A GPU-oriented parallel algorithm has been devised to support efficient updates against high-speed graph streams.
Compared to frameworks like Stinger, which runs on a 40-core CPU server, GPMA outperforms CPU-based approaches because of optimizations that take advantage of the GPUs. One of the main reasons is that because many graph applications are data and compute-intensive, GPMA may take advantage of the GPU's exceptionally high memory bandwidth and massive parallelism. Second, GPMA is significantly more efficient because it avoids the expensive process of rebuilding the graph via incremental updates while introducing minimal overheads for existing graph algorithms running its graph structure.
We have built a PyBind11 wrapper over the C++ implementation of GPMA to expose it as a Python module that Seastar can use to store dynamic graph datasets efficiently.
It is a new dynamic sparse graph representation called Packed Compressed Sparse Row (PCSR), based on an array- based dynamic data structure called the Packed Memory Array. PCSR is similar to CSR, but leaves spaces between elements, allowing for asymptotically faster insertions and deletions in exchange for a constant factor slowdown in traversals and a constant factor increase in space overhead.
We find that for slightly more storage and query time, it is able to achieve similar mutability speeds to the adjacency list. CSR was unable to handle many inserts in a reasonable amount of time. PCSR was orders of magnitude faster for inserts and updates than CSR and adjacency list while maintaining similar graph traversal times. The growth of social networks and other changing graphs necessitates the need for efficient dynamic graph structures. PCSR is a basic dynamic graph storage format that can fit into existing graph processing frameworks and support fast insertions with comparable traversal times. Future work includes implementing deletes, other graph algorithms such as shortest-paths, and PCSR in external memory. Additionally, PCSR can be sped up with a parallel implementation.
With the help of PyBind11, we were able to create a wrapper around the C++ implementation of PCSR to expose it as a Python module that can be used by Seastar.
To test our novel implementation against state-of-the-art GNN systems, we evaluated Seastar against PyTorch Geometric-Temporal (PyG-T). We will evaluate them based on their performance and their space usage.
For benchmarking, we have created a synthetic temporal graph dataset generator named Boorah. The user can specify the number of nodes and an edge multiplier to set the sparsity of the graph. We have decided to use the following graph dataset with different node counts and edge sparsity.
<--------Enter Table of Boorah Datasets --------->
Hardware System We have conducted the experiments of the following hardware
Things to add to our mini final report