CurryTang / Graph-LLM

Exploring the Potential of Large Language Models (LLMs) in Learning on Graphs

There are duplicate edges in the Cora dataset when using the provided processed datasets! #12

Closed zhongjian-zhang closed 8 months ago

zhongjian-zhang commented 8 months ago

Hi, I found duplicate edges in the Cora dataset when directly using the data from Google Drive. I first used the following code to convert data.edge_index to a csr_matrix:

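```python
from scipy.sparse import csr_matrix

def edge_index_to_csr_matrix(edge_index):
    rows, cols = edge_index
    data = [1] * len(rows)  # one entry per edge; repeated (row, col) pairs are summed
    return csr_matrix((data, (rows, cols)))
```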

I then found that the maximum value in the resulting csr_matrix is 2. Does that mean there are some duplicate edges?
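Since csr_matrix sums entries that share the same (row, col) position, a value of 2 suggests that the pair occurs twice in edge_index. A minimal sketch of how one might list and drop such repeats, assuming edge_index can be converted to a 2 x num_edges integer NumPy array (the helper names and the toy graph below are hypothetical, not part of the repository or of Cora):

```python
import numpy as np

def find_duplicate_edges(edge_index):
    """Return the (source, target) columns that occur more than once, with their counts."""
    edges = np.asarray(edge_index)  # expected shape: (2, num_edges)
    unique_edges, counts = np.unique(edges, axis=1, return_counts=True)
    repeated = counts > 1
    return unique_edges[:, repeated], counts[repeated]

def remove_duplicate_edges(edge_index):
    """Keep one copy of each directed edge; columns come back in sorted order."""
    return np.unique(np.asarray(edge_index), axis=1)

# Toy graph (made-up data): the edge 0 -> 1 appears twice.
toy_edge_index = np.array([[0, 0, 1, 2],
                           [1, 1, 2, 0]])
dup_edges, dup_counts = find_duplicate_edges(toy_edge_index)
deduped = remove_duplicate_edges(toy_edge_index)
```

Note that this treats edges as directed, so (0, 1) and (1, 0) count as different edges. It also reorders the columns, which matters if edge attributes are stored in a parallel array.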

CurryTang commented 8 months ago

Please check https://github.com/CurryTang/Graph-LLM/issues/7