Open acezen opened 1 year ago
LDBC SNB/BI dataset
What about the Stanford graph datasets (SNAP)? They include many networks of different types, ranging from KB to GB in size, and they are already split by task, such as community detection and graph classification.
That would be a great data source for GraphAr, thanks for the proposal!
We could consider utilizing the following graph datasets for our proposal:
Property Graphs: The LDBC graphs feature a variety of vertex and edge types, each with associated properties that encompass diverse data types. These graphs can be generated at various scales to accommodate different analysis needs.
Simple Topological Graphs: The SNAP datasets offer a collection of real-world graphs from multiple domains, including social networks, web graphs, and road networks, among others. Additionally, the Laboratory for Web Algorithmics provides a range of large-scale web graphs compressed using LLP + WebGraph. These can be particularly useful for evaluating the storage efficiency of GraphAr.
Labeled Property Graphs: The neo4j-graph-examples repository contains graphs in Neo4j dump format, characterized by the inclusion of vertex labels. Each vertex in these graphs may have multiple associated labels, adding complexity to the graph properties.
GNN Graphs: The OGBN graphs are tailored for node property prediction tasks, with the predicted labels represented as vertex labels. These graphs are well suited for representing the graph structures used by GNNs (Graph Neural Networks).
Subsequent considerations may encompass the use of RDF (Resource Description Framework) datasets, temporal graphs, and knowledge graphs.
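As a rough illustration of how one of the simple topological graphs might be inspected before conversion, here is a minimal Python sketch. It assumes the plain-text edge-list layout commonly used by SNAP downloads (`#` comment lines followed by whitespace-separated source and destination IDs). The helper names `parse_snap_edge_list` and `edge_list_stats` and the sample data are hypothetical and are not part of GraphAr or SNAP.

```python
import io


def parse_snap_edge_list(lines):
    """Parse SNAP-style edge-list text into a list of (src, dst) integer pairs.

    Assumed layout: lines starting with '#' are comments; every other
    non-empty line holds a source ID and a destination ID separated by
    whitespace.
    """
    edges = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        src, dst = line.split()[:2]
        edges.append((int(src), int(dst)))
    return edges


def edge_list_stats(edges):
    """Return basic size statistics for a parsed edge list."""
    vertices = {v for edge in edges for v in edge}
    return {"num_vertices": len(vertices), "num_edges": len(edges)}


# Hypothetical in-memory sample standing in for a downloaded file.
sample = io.StringIO(
    "# Directed graph (hypothetical sample)\n"
    "# FromNodeId\tToNodeId\n"
    "0\t1\n"
    "0\t2\n"
    "1\t2\n"
)
edges = parse_snap_edge_list(sample)
stats = edge_list_stats(edges)
print(stats)
```

For a real dataset, the same functions could be fed an open file handle instead of the in-memory sample. The resulting vertex and edge counts give a quick sense of scale before choosing chunk sizes for a GraphAr conversion.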
@acezen, do you have any more comments on this proposal?
Looks good to me
To improve the capabilities of the GraphAr format, we plan to build a data hub of datasets stored in GraphAr format.
This issue is for gathering graph datasets, which should ideally meet the following requirements:
Any comments, questions, and dataset suggestions are welcome!