rapidsai / cugraph

cuGraph - RAPIDS Graph Analytics Library
https://docs.rapids.ai/api/cugraph/stable/
Apache License 2.0

[QST]: `cugraph.induced_subgraph()` not recognizing graph argument #4472

Open stephwon opened 5 months ago

stephwon commented 5 months ago

What is your question?

Although my graph `G` is a `cugraph.Graph`, calling `cugraph.induced_subgraph()` raises `TypeError: Argument 'graph' has incorrect type (expected pylibcugraph.graphs._GPUGraph, got dict)`. The bolded portion of my code shows that `G` is a cuGraph structure, so I'm unsure why the function fails to recognize it. How can I fix this error and extract the subgraph properly?

Here is my full code:

import cudf
import cugraph
import dask_cudf

columns_to_read = ['subject', 'object', 'predicate']

# Read the TSV file using Dask cuDF with specific columns
df_graph = dask_cudf.read_csv('graph.tsv', sep='\t', 
                        usecols=columns_to_read) #directed graph

# Renaming columns to match cuGraph requirements
df_graph = df_graph.rename(columns={'subject': 'source', 'object': 'destination'})

# Create graph from input data
G = cugraph.Graph(directed=False) # load directed graph as undirected
G.from_dask_cudf_edgelist(df_graph, source = 'source', destination = 'destination') # Number of edges in the graph: 37134918

**# Check to make sure G is in cugraph.Graph
type(G)
Output: 
cugraph.structure.graph_classes.Graph**

# Create a subgraph containing only the nodes of interest and their edges

nodes_of_interest = ['PUBCHEM.COMPOUND:33741', 'NCBIGene:4988', 'NCBIGene:111', 'NCBIGene:3767', 'GO:0070509', 'GO:1990793', 'GO:0061535', 
                     'GO:0099610', 'CL:0000198', 'NCBIGene:6530', 'GO:0051620', 'PUBCHEM.COMPOUND:439260', 'GO:0061533', 'NCBIGene:6532', 
                     'GO:0051610', 'PUBCHEM.COMPOUND:5202', 'GO:0060096', 'GO:0019233', 'HP:0012531', 'MONDO:0005178']  # Example nodes of interest

svert = cudf.Series(nodes_of_interest)
subgraph = cugraph.induced_subgraph(G, svert)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[8], line 8
      3 nodes_of_interest = ['PUBCHEM.COMPOUND:33741', 'NCBIGene:4988', 'NCBIGene:111', 'NCBIGene:3767', 'GO:0070509', 'GO:1990793', 'GO:0061535', 
      4                      'GO:0099610', 'CL:0000198', 'NCBIGene:6530', 'GO:0051620', 'PUBCHEM.COMPOUND:439260', 'GO:0061533', 'NCBIGene:6532', 
      5                      'GO:0051610', 'PUBCHEM.COMPOUND:5202', 'GO:0060096', 'GO:0019233', 'HP:0012531', 'MONDO:0005178']  # Example nodes of interest
      7 svert = cudf.Series(nodes_of_interest)
----> 8 subgraph = cugraph.induced_subgraph(G, svert)

File /anaconda3/envs/rapids-24.04/lib/python3.11/site-packages/cugraph/community/induced_subgraph.py:128, in induced_subgraph(G, vertices, offsets)
    125 result_graph = Graph(directed=directed)
    127 do_expensive_check = False
--> 128 source, destination, weight, offsets = pylibcugraph_induced_subgraph(
    129     resource_handle=ResourceHandle(),
    130     graph=G._plc_graph,
    131     subgraph_vertices=vertices,
    132     subgraph_offsets=offsets,
    133     do_expensive_check=do_expensive_check,
    134 )
    136 df = cudf.DataFrame()
    137 df["src"] = source

TypeError: Argument 'graph' has incorrect type (expected pylibcugraph.graphs._GPUGraph, got dict)


alexbarghi-nv commented 5 months ago

@pacificoceanmist you want `cugraph.dask.induced_subgraph`, since this is a multi-GPU graph (it was built with `from_dask_cudf_edgelist`).

nv-rliu commented 4 months ago

Try this:

import cugraph.dask as dcg

subgraph = dcg.induced_subgraph(G, svert)
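If the same notebook has to handle both single-GPU and Dask-backed graphs, the choice between the two entry points could be wrapped in a small helper. The following is only a sketch: `induced_subgraph_module` and `induced_subgraph_for` are hypothetical names, and the check assumes that a graph loaded with `from_dask_cudf_edgelist` keeps a dict in its private `_plc_graph` attribute, which is inferred from the "got dict" in the traceback above and is not a documented API, so it may change between releases.

```python
import importlib


def induced_subgraph_module(G):
    """Name the module whose induced_subgraph matches how G was built.

    Assumption (inferred from the traceback, not documented API): a
    multi-GPU graph stores a dict in the private `_plc_graph` attribute,
    while a single-GPU graph stores one pylibcugraph graph object.
    """
    plc_graph = getattr(G, "_plc_graph", None)
    return "cugraph.dask" if isinstance(plc_graph, dict) else "cugraph"


def induced_subgraph_for(G, vertices):
    """Call induced_subgraph from the matching module (hypothetical helper)."""
    # Import lazily so the module is only loaded when the call is made.
    module = importlib.import_module(induced_subgraph_module(G))
    return module.induced_subgraph(G, vertices)
```

With the graph from the question, `induced_subgraph_for(G, svert)` would route to `cugraph.dask.induced_subgraph`. Calling `dcg.induced_subgraph` directly, as suggested above, is simpler when the graph is known to be multi-GPU.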