Is your feature request related to a problem? Please describe. Related: https://github.com/rapidsai/cugraph/issues/187

class Graph (defined in c_graph.pyx) stores a graph and provides functions to initialize, delete, convert (between different graph formats), and query.

initialize: add_edge_list, add_adj_list
delete: delete_edge_list, delete_adj_list, delete_transposed_adj_list
convert: add_transposed_adj_list, (view_edge_list, view_adj_list, num_vertices).
query: view_edge_list, view_adj_list, num_vertices

There are corresponding C++/CUDA level functions.

initialize: gdf_edge_list_view, gdf_adj_list_view
delete: gdf_delete_edge_list, gdf_delete_adj_list, gdf_delete_transposed_adj_list
convert: gdf_add_edge_list, gdf_add_adj_list, gdf_add_transposed_adj_list
query: N/A

There exist some inconsistencies in naming, and some functions may surprise users. In general, surprising users is bad in API design. If a user expects a function to do A, but the function silently does B, or (B + A), it may leads to a hard-to-fix bug till the user finds that the function is doing something unexpected.

So inconsistencies. In the python level, add_edge_list and add_adj_list initialize a graph while add_transposed_adj_list performs conversion. In CUDA/C++, gdfadd..._list performs conversion but add_edge_list and add_adjlist perform initialization in python. In CUDA/C++, gdf..._list_view functions perform initialization. Calling both add_edge_list and add_adj_list can result in a single graph instance storing two different graphs (issue #187). And we have num_vertices, why no num_edges?

And the surprise part. view_edge_list, view_adj_list, and num_vertices implicitly perform conversion. This requires more thoughts as there is a design trade-off. Implicit (or automatic) conversion allows users to type less. But this can require a significant amount of computing and also more memory for large graphs. It can be quite surprising for users to see memory allocation error after long wait when they simply query the number of vertices.

Describe the solution you'd like So we may clarify the API functions for initialization, deletion, conversion, and query and also need to fix inconsistencies between the python API and C++/CUDA API.

For initialization:

Option 1) initialize in instance creation (similar to cudf dataframe or series). e.g. G = cugraph.Graph(col_0, col_1, col_2, format) # col_0 and col_1 do not sound like good names but col_0 and col_1 can have different meanings in COO and CSR
Option 2) Initialize using view e.g. G = cugraph.view_edge_list(source_col, dest_col, value_col) # isn't weight_col a better name than value_col? and we should we use edge_list, adj_list, transposed_adj_list or COO, CSR, and CSC?

For deletion: What we have looks OK to me...

For conversion:

Option 1) use add_..._list # add implies that we are adding an additional representation from an existing representation.
Option 2) I thought about convertto..._list but this somewhat implies that we are replacing the existing format.

For query:

Option 1) Leave as is.
Option 2) Raise an API error if querying requires conversion.
Option 3) Perform conversion but with a warning message.

Describe alternatives you've considered Any comments and thoughts (and new options) will be appreciated.

I would like to mirror the comments by @seunghwak. In addition, coming in as a NetworkX user, I was also surprised by the API design. Being surprised by the API is a red flag for me as an end-user, and a tutorial leader for NetworkX-based tutorials (I have a number coming up over the next few months).

Having seen a tweet on Twitter, I had quite a number of high hopes regarding the API. In particular, because NetworkX has been so idiomatic for learning and doing graph analytics, I thought it would be super awesome if cuGraph just outright copied the NetworkX API, and the team followed-up with PRs to NetworkX to suggest improvements to the NetworkX API (and sync it up) where the nx API might not be the best. In particular, if the API were switched over to explicitly mirror the NetworkX API, I could very easily recommend using it as the solution for scalability when I do my NetworkX tutorials at conferences. Scaling up is a common question for NetworkX users, and I have traditionally recommended iGraph with some hesitation, because the API is different. As it stands, however, cuGraph is not a solution I would be ready to recommend - the API is very different from NetworkX's API.

Graph creation, in particular, is a great example of where I was surprised when I read the cuGraph API. Reading the docs is when I realized that cuGraph is instead not a drop-in replacement for NetworkX.

In NetworkX, the following API calls are used:

import networkx as nx

G = nx.Graph()
G.add_edges_from(...)  # add a set of edges to the graph
G.add_edge(...)  # add an edge to the graph
G.add_nodes_from(...)   # add a set of nodes to the graph
G.add_node(...)  # add a node
G.nodes  # get view of nodes
G.edges  # get view of edges

In order to make cuGraph a drop-in replacement for NetworkX, at the very minimum, those same class methods should be present.

In addition, in NetworkX, "edge list" is the term used for a list of two- or three-tuples that are passed in to create edges, with nodes automatically created. "adjacency", from my memory, as a term it is reserved the adjacency matrix. This separation of vocabulary makes it much easier for an end-user to reason about what exactly we're requesting for from the graph.

I'm happy to chat further, and even help out with input on where to match the API for end-users. The underlying implementation is one for which I'm less concerned about (though important, for reproducibility/accuracy/"science etc."), but the API really matters.

The RAPIDS team has done a great job already with cuDF copying the pandas API, and CuPy also has been awesome with its explicit "copy NumPy" policy. cuGraph actually copying NetworkX's API would make this a super awesome project too!

Thank you very much @ericmjl for the detailed comments. This makes a lot of sense of to me, and we greatly appreciate your feedback and this will be really helpful for us to improve cuGraph!!!

My pleasure, @seunghwak! Please do let me know if I can help in some way. I'll be keeping an eye out here! :smile:

Thanks a lot, and we'll post the proposed updates in this thread before jumping to implementation, and we'll greatly appreciate any feedbacks!!!

We drafted an initial plan to better match the NetworkX API and summarized design issues that need further discussion. Now this discussion is moved to github, and any feedback or suggestions will be greatly appreciated.

cuGRAPH adding an API Layer to match NetworkX

Design Objectives
Major Challenges and Design Decisions
Class Graph
Graph Generation 4.1. R-mat graph generation
Algorithms 5.1. Breadth-first-search 5.2. Jaccard 5.3. Louvain 5.4. Overlap Coefficient 5.5. Pagerank 5.6. Spectral Clustering 5.7. Single-source-shortest-path
References

1. Design Objectives

Ease of migration from NetworkX: NetworkX is widely used for graph analytics but slow in processing large graphs. We need to minimize the migration cost from NetworkX to promote cuGraph as the scalability solution for NetworkX users.
Interoperability with other RAPIDS projects: RAPIDS aims to minimize data migration and conversion overhead in complex data analytics, and RAPIDS is built on top of a columnar data structure called a GPU DataFrame to achieve this goal. cuGraph should follow this design decision for efficient communication with other RAPIDS projects.
Other considerations: We consider cuGraph as RAPIDS’s compute engine for graph analytics. We may not store graph, node, edge attributes that are not directly used by cuGraph algorithms inside a cuGraph object; the node attribute “bipartite” and the edge attribute “weight” are exceptions as some cuGraph algorithms require these attributes.

2. Major Challenges and Design Decisions

cuGraph is the compute engine for graph analytics for the bigger RAPIDS project. We do not store all the auxiliary data that may not be used by cuGraph algorithms inside cuGraph objects. NetworkX stores graph, node, and edge attributes inside NetworkX objects whether they are used by NetworkX algorithms or not. We attempt to follow the NetworkX API wherever possible but we need to deviate from the NetworkX API in several places due to this difference; cuGraph is more restrictive in supporting graph, node, and edge attributes.

3. Class Graph

NetworkX & cuGraph comparison

NetworkX has separate classes for different graph types: Graph for undirected graphs, DiGraph for directed graphs, MultiGraph for undirected multigraphs, and MultiDiGraph for directed multigraphs. cuGraph has a single Graph class.
NetworkX supports varying numbers of attributes for graphs, nodes, and edges. cuGraph supports only edge weights (we plan to add the node attribute “bipartite”).
Graph updates are more expensive in cuGraph than NetworkX.

Design Issues

We store only the attributes directly used by cuGraph algorithms inside cuGraph objects; cuGraph objects can store the node attribute “bipartite” and the edge attribute “weight” but other attributes are stored outside cuGraph objects.
How should we support graph updates? 1) Batch updates; 2) update the underlying data structure for efficient dynamic graph updates; 3) or disallow graph updates (in this case, users need to update graphs outside cuGraph and explicitly initialize again). In short term, we plan to support graph construction on initialization and batch updates only.
Multigraph support.
Graphs are initialized with edge data. How should we support the node attribute “bipartite”?
Should we expose multiple formats (CSR, CSC, COO) to users? For example, cuGraph currently has delete_edge_list(), delete_adj_list(), and delete_transposed_adj_list(). NetworkX has only a single clear() function.

API Proposal

Initialization
- cugraph.Graph.__init__(incoming_graphdata=None, weight=’weight’) (following networkX.Graph.__init_\()): incoming_graph_data is an edge list or any cuGraph graph object. NetworkX takes a list of tuples as an edge list. cuGraph should take a dataframe with multiple columns instead (two columns for sources and destinations, an optional column for edge weights). We may consider supporting a list of tuples and NetworkX graph objects in the future to improve NetworkX compatibility (assuming that cuDF provides an efficient mechanism to convert a list of tuples or python dictionaries to a columnar GPU DataFrame).
- cuGraph.Graph.add_edges_from(ebunch_to_add) (following networkX.Graph.add_edges_from()): ebunch_to_add is a container of tuples (one tuple per edge) in NetworkX. cuGraph should take a dataframe with multiple columns instead (two columns for source and destination).
- cuGraph.Graph.add_weighted_edges_from(ebunch_to_add, weight=’weight’) (following networkX.Graph.add_weighted_edges_from()): similar to add_edges_from(). weight=’weight’ points the additional column to provide edge weights.
- cuGraph.from_cudf_adjacencylist(df, offset=’offset’, index=’index’, weight=None, create_using=None) (following networkx.convert_matrix.from_pandas_adjacency()): NetworkX takes an adjacency matrix. cuGraph should take an adjacency list (offsets + indices + optional edge weight).
- cuGraph.from_cudf_edgelist(df, source=’source’, target=’target’, weight=None, create_using=None) (following networkx.convert_matrix.from_pandas_edgelist).
- Same functions for DiGraph.
Deletion
- cuGraph.Graph.clear() (following networkX.Graph.clear())
- Same functions for DiGraph.
Retrieval
- cuGraph.to_cudf_adjacencylist (following networkx.convert_matrix.to_pandas_adjacency()): Note that this function returns an adjacency list instead of an adjacency matrix.
- cuGraph.to_cudf_edgelist (following networkx.convert_matrix.to_pandas_edgelist()).
- We may add cuGraph.to_pandas_adjacency() and cuGraph.to_pandas_edgelist() in the future, or we may handle this outside cuGraph (use cuDF for conversion outside cuGraph).
Additional functions from NetworkX
- number_of_nodes().
- number_of_edges().
- NetworkX has many more functions to query graphs. cuGraph currently has only few. We need to add these functions over time in future releases.

4. Graph Generation

4.1. R-mat graph generation

NetworkX & cuGraph comparison

NetworkX does not have an R-mat graph generator but has another scale free graph generator with the following syntax: scale_free_graph(n, alpha=0.41, beta=0.54, gamma=0.05, delta_in=0.2, delta_out=0, create_using=None, seed=None). cuGraph’s R-mat graph generator has the following syntax: grmat_gen(argv), where argv is a string of options.

API Proposal

rmat_graph(scale=10, a=0.57, b=0.19, c=0.19, d=0.05, create_using=None, seed=None).

5. Algorithms

5.1. Breadth-first-search

NetworkX & cuGraph comparison

NetworkX: bfs_edges(G, source, reverse=False, depth_limit=None).
cuGraph: bfs(G, start, directed=True).

API Proposal

Follow NetworkX.

5.2. Jaccard

NetworkX & cuGraph comparison

NetworkX: jaccard_coefficient(G, ebunch=None).
cuGraph: jaccard(input_graph, first=None, second=None), In addition, cuGraph supports weighted Jaccard as well: jaccard_w(input_graph, weights, first=None, second=None).

Design Issue

Should we create another function name (e.g. weighted_jaccard_coefficient) for weighted Jaccard or use the same function name with an optional input parameter weight=’weight’?

API Proposal

Follow NetworkX for Jaccard coefficient. For weighted Jaccard coefficient, add weight=’weight’ to the input parameter list.

5.3. Louvain

NetworkX & cuGraph comparison

NetworkX: Supported in NetworkX via the python-louvain community (https://python-louvain.readthedocs.io/en/latest/api.html): best_partition(G, partition=None, weight='weight', resolution=1.0, randomize=None, random_state=None).
cuGraph: nvLouvain(input_graph).

Design Issues

Should we expose optional parameters to users?

API Proposal

Just rename nvLouvain to louvain.
Add a second API with advanced parameters.

5.4. Overlap Coefficient

NetworkX & cuGraph comparison

NetworkX: N/A but we may refer to jaccard_coefficient.
cuGraph: overlap(input_graph, first=None, second=None) overlap_w(input_graph, weights, first=None, second=None).

Design Issue

Similar to the Jaccard case, should we create another function name (e.g. weighted_overlap_coefficient) for weighted overlap coefficients or use the same function name with an optional input parameter weight=’weight’?

API Proposal

Follow NetworkX for overlap coefficient(overlap_coefficient(G, ebunch=None)). For weighted overlap coefficient, add weight=’weight’ to the input parameter list.

5.5. Pagerank

NetworkX & cuGraph comparison

NetworkX: pagerank(G, alpha=0.85, personalization=None, max_iter=100, tol=1e-06, nstart=None, weight=’weight’, dangling=None).
cuGraph: pagerank(G, alpha=0.85, max_iter=100, tol=1.0e-5).

API Proposal

Follow NetworkX.

5.6. Spectral Clustering

NetworkX & cuGraph comparison

NetwokrX: N/A for spectral clustering. modularity(G, communities, weight=’weight’) to compute modularity.
SKLearn: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.spectral_clustering.html
cuGraph: spectralBalancedCutClustering(G, num_clusters, num_eigen_vects=2, evs_tolerance=.00001, evs_max_iter=100, kmean_tolerance=.00001, kmean_max_iter=100), spectralModularityMaximizationClustering(G, num_clusters, num_eigen_vects=2, evs_tolerance=.00001, evs_max_iter=100, kmean_tolerance=.00001, kmean_max_iter=100), analyzeClustering_modularity(G, n_clusters, clustering), analyzeClustering_edge_cut(G, n_clusters, clustering)

Design Issues

Should we follow SKLearn?

API Proposal

Follow SKLearn for spectral clustering.

5.7. Single-source-shortest-path

NetworkX & cuGraph comparison

NetworkX: shortest_path(G, source=None, target=None, weight=None, method=’dijkstra’) or single_source_shortest_path(G, source, cutoff=None).
cuGraph: sssp(G, source)

API Proposal

Follow NetworkX. Throw exceptions for the optional input parameters cuGraph currently does not support.

6. References

[NetworkX] Aric Hagberg, Dan Schult, Pieter Swart, NetworkX Reference Release 2.3rc2.dev20190331021853, March 31st, 2019.

@seunghwak amazing work here! Thanks for posting back.

Looking through the design document, I have the following comments:

Multigraph support.

I think it is reasonable to keep multigraph support on a "later in the roadmap" case. The most common graph use cases are single edges between nodes, as well as distinguishing between directed and undirected.

We store only the attributes directly used by cuGraph algorithms inside cuGraph objects; cuGraph objects can store the node attribute “bipartite” and the edge attribute “weight” but other attributes are stored outside cuGraph objects.

Perhaps taking advantage of the underlying columnar data structure could be helpful? A common use case for graphs is to store the same kind of data under each key for each node, and do a similar thing for each edge. This is equivalent to keeping a node table (rather than a node list) and an edge table (rather than an edge list). Both of these can be cast as data frames; an example I use is the Divvy bike sharing dataset in this notebook on NAMS.

Graphs are initialized with edge data. How should we support the node attribute “bipartite”?

Also suggest using the aforementioned data structure?

Also thought a bit about how API access can be done later - it's common for NetworkX users to query for nodes/edges with a certain attribute value directly. These are equivalent to dataframe filtering operations.

Should we expose multiple formats (CSR, CSC, COO) to users? For example, cuGraph currently has delete_edge_list(), delete_adj_list(), and delete_transposed_adj_list(). NetworkX has only a single clear() function.

Personal opinion, but I think simplicity would be better in this case. I don't know how to articulate why, it's just a gut feeling.

We may consider supporting a list of tuples and NetworkX graph objects in the future to improve NetworkX compatibility (assuming that cuDF provides an efficient mechanism to convert a list of tuples or python dictionaries to a columnar GPU DataFrame).

That would be amazing! That's the most idiomatic way of interacting with NetworkX graphs at the moment. The next most idiomatic way is to use dataframes, which NetworkX also supports, btw.

rmat_graph(scale=10, a=0.57, b=0.19, c=0.19, d=0.05, create_using=None, seed=None)

Looks idiomatic to me, as the suffix _graph is in-line with how nx graph generators are named.

Just rename nvLouvain to louvain. Add a second API with advanced parameters.

Perhaps just adding the advanced parameters with sane defaults would be okay? Python users are generally ok with that.

Follow SKLearn for spectral clustering.

100% agree.

I guess overall, I see it as being awesome that you guys are designing against NetworkX's API!

In terms of the underlying data structure, I can see how an internal change to using cuDFs would be a great advantage for interoperability with the rest of the RAPIDS ecosystem. I guess in terms of user-facing design, ensuring that the appropriate connectors are present to make it usable with the NetworkX API would be very helpful.

My biggest thought on how to keep things simple for you all is to use a cuDF each for the node and edge tables, and then building the connectors necessary to ask for graph data. Multi-partite graphs can be represented using a column called "partition" (for the general case) and "bipartite" (for the case of being interoperable with NetworkX). (Side note: bipartite is definitely more commonly used right now.)

Thanks again for keeping us updated, @seunghwak!

To properly support Graph (undirected) and DiGraph (directed) in NetworkX, https://github.com/rapidsai/cugraph/issues/316 and https://github.com/rapidsai/cugraph/issues/317 need to be addressed first (and cugraph's C++ gdf_graph (or cugraph::graph following cudf's ongoing transition from gdf_column to cudf::column) needs to be updated to explicitly handle both undirected and directed graphs). Once we use different classes for undirected and directed graphs following NetworkX, https://github.com/rapidsai/cugraph/issues/318 can be easily solved.

So inconsistencies. In the python level, add_edge_list and add_adj_list initialize a graph while add_transposed_adj_list performs conversion. In CUDA/C++, gdfadd..._list performs conversion but add_edge_list and add_adjlist perform initialization in python. In CUDA/C++, gdf..._list_view functions perform initialization. Calling both add_edge_list and add_adj_list can result in a single graph instance storing two different graphs (issue #187). And we have num_vertices, why no num_edges?

This has been addressed. With graph class changes, we now have a check https://github.com/rapidsai/cugraph/blob/5fc8b658c42b581ffc25a78447bb89c1f761357d/python/cugraph/structure/graph.py#L328 to raise exception to avoid "a single graph storing two different graphs". Also we have both number_of_vertices and number_of_edges graph functions.

And the surprise part. view_edge_list, view_adj_list, and num_vertices implicitly perform conversion. This requires more thoughts as there is a design trade-off. Implicit (or automatic) conversion allows users to type less. But this can require a significant amount of computing and also more memory for large graphs. It can be quite surprising for users to see memory allocation error after long wait when they simply query the number of vertices.

Number of vertices is executed completely in python. It stores node_Count so this type of conversion only happens once for a graph, and it later calls, this node_count value is retrieved. Viewing also performs implicit conversions for convenience and networkx consistency. However even these conversions would happen only once for a Graph and is user-friendly for situations where user inputs edgelist and wants to simply get/generate an adjlist. For benchmarking/timing these coo2csr conversions can be explicitly done before to avoid including these overheads in the overall performance analysis.

Option 1) initialize in instance creation (similar to cudf dataframe or series). e.g. G = cugraph.Graph(col_0, col_1, col_2, format) # col_0 and col_1 do not sound like good names but col_0 and col_1 can have different meanings in COO and CSR

Option 2) Initialize using view e.g. G = cugraph.view_edge_list(source_col, dest_col, value_col) # isn't weight_col a better name than value_col? and we should we use edge_list, adj_list, transposed_adj_list or COO, CSR, and CSC?

In NetworkX, the following API calls are used: G.add_edges_from(...) # add a set of edges to the graph G.add_edge(...) # add an edge to the graph G.add_nodes_from(...) # add a set of nodes to the graph G.add_node(...) # add a node In order to make cuGraph a drop-in replacement for NetworkX, at the very minimum, those same class methods should be present.

We have this API changed quite a bit. It now follows networkx from_pandas_edgelist. We can either use 1) G = cugraph.from_cudf_edgelist(df, source='src', destination='dest', edge_attr='weights', create_using = cugraph.Graph) 2) G = cugraph.Graph() G.from_cudf_edgelist(df, source='src', destination='dest', edge_attr='weights')

3. Class Graph

NetworkX & cuGraph comparison

NetworkX has separate classes for different graph types: Graph for undirected graphs, DiGraph for directed graphs, MultiGraph for undirected multigraphs, and MultiDiGraph for directed multigraphs. cuGraph has a single Graph class.

Addressed. Cugraph now has separate Graphs.

NetworkX supports varying numbers of attributes for graphs, nodes, and edges. cuGraph supports only edge weights (we plan to add the node attribute “bipartite”).

Design Issues

We store only the attributes directly used by cuGraph algorithms inside cuGraph objects; cuGraph objects can store the node attribute “bipartite” and the edge attribute “weight” but other attributes are stored outside cuGraph objects.

How should we support graph updates? 1) Batch updates; 2) update the underlying data structure for efficient dynamic graph updates; 3) or disallow graph updates (in this case, users need to update graphs outside cuGraph and explicitly initialize again). In short term, we plan to support graph construction on initialization and batch updates only.

Multigraph support.

Graphs are initialized with edge data. How should we support the node attribute “bipartite”?

bipartite and multigraphs with more than one edge_attr is in progress.

API Proposal

Initialization

cugraph.Graph.init(incoming_graph_data=None, weight=’weight’) (following networkX.Graph.init()): incoming_graph_data is an edge list or any cuGraph graph object. NetworkX takes a list of tuples as an edge list. cuGraph should take a dataframe with multiple columns instead (two columns for sources and destinations, an optional column for edge weights). We may consider supporting a list of tuples and NetworkX graph objects in the future to improve NetworkX compatibility (assuming that cuDF provides an efficient mechanism to convert a list of tuples or python dictionaries to a columnar GPU DataFrame).

cuGraph.Graph.add_edges_from(ebunch_to_add) (following networkX.Graph.add_edges_from()): ebunch_to_add is a container of tuples (one tuple per edge) in NetworkX. cuGraph should take a dataframe with multiple columns instead (two columns for source and destination).

cuGraph.Graph.add_weighted_edges_from(ebunch_to_add, weight=’weight’) (following networkX.Graph.add_weighted_edges_from()): similar to add_edges_from(). weight=’weight’ points the additional column to provide edge weights.

cuGraph.from_cudf_adjacencylist(df, offset=’offset’, index=’index’, weight=None, create_using=None) (following networkx.convert_matrix.from_pandas_adjacency()): NetworkX takes an adjacency matrix. cuGraph should take an adjacency list (offsets + indices + optional edge weight).

cuGraph.from_cudf_edgelist(df, source=’source’, target=’target’, weight=None, create_using=None) (following networkx.convert_matrix.from_pandas_edgelist).

Same functions for DiGraph.

We support the cuGraph.from_cudf_edgelist() functionality as the way of adding edges, consistent with networkx's nx.from_pandas_edgelist()

Retrieval

cuGraph.to_cudf_adjacencylist (following networkx.convert_matrix.to_pandas_adjacency()): Note that this function returns an adjacency list instead of an adjacency matrix.

cuGraph.to_cudf_edgelist (following networkx.convert_matrix.to_pandas_edgelist()).

We may add cuGraph.to_pandas_adjacency() and cuGraph.to_pandas_edgelist() in the future, or we may handle this outside cuGraph (use cuDF for conversion outside cuGraph).

Retrieval is done using view functions

Additional functions from NetworkX

number_of_nodes().

number_of_edges().

NetworkX has many more functions to query graphs. cuGraph currently has only few. We need to add these functions over time in future releases.

Many functions have been added to Graph including number_of_nodes() and number_of_edges(). PR #822 adds many of the networkx graph functions.

We are doing a review of where cuGraph is related to matching NetworkX. Identified discrepancies will be added as new issues

rapidsai / cugraph

[ENH] class Graph API (python) design needs improvements #192

cuGRAPH adding an API Layer to match NetworkX

Table of Contents

1. Design Objectives

2. Major Challenges and Design Decisions

3. Class Graph

4. Graph Generation

4.1. R-mat graph generation

5. Algorithms

5.1. Breadth-first-search

5.2. Jaccard

5.3. Louvain

5.4. Overlap Coefficient

5.5. Pagerank

5.6. Spectral Clustering

5.7. Single-source-shortest-path

6. References

3. Class Graph