xgi-org / xgi-data

Standardized higher-order datasets with corresponding datasheets
https://zenodo.org/communities/xgi

XGI-DATA

This repository collects openly available hypergraph datasets in JSON format, each accompanied by documentation that describes the dataset in more detail. The datasets are hosted in the XGI Community on Zenodo, and a table of statistics can be found on Read the Docs. A rudimentary inspection script is also provided for checking that datasets are in the proper format. The project is loosely inspired by "Datasheets for Datasets" by Gebru et al.

Overview of the xgi-data format

The xgi-data format for hypergraph datasets is a JSON object with the following structure:

All IDs are strings but can be converted to other types if desired.
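
As a minimal sketch of such a conversion: the example below assumes a dataset with top-level keys named hypergraph-data, node-data, edge-data, and edge-dict. These key names, the toy data, and the helper convert_ids are assumptions made for illustration and are not taken from this README.

```python
# Hypothetical miniature dataset. The key names and contents are assumptions.
data = {
    "hypergraph-data": {"name": "toy"},
    "node-data": {"1": {"label": "a"}, "2": {"label": "b"}, "3": {"label": "c"}},
    "edge-data": {"0": {}, "1": {}},
    "edge-dict": {"0": ["1", "2"], "1": ["2", "3"]},
}


def convert_ids(data, nodetype=int, edgetype=int):
    """Return a copy of the dataset with node and edge IDs cast to the given types."""
    return {
        "hypergraph-data": dict(data["hypergraph-data"]),
        "node-data": {nodetype(n): attrs for n, attrs in data["node-data"].items()},
        "edge-data": {edgetype(e): attrs for e, attrs in data["edge-data"].items()},
        "edge-dict": {
            edgetype(e): [nodetype(n) for n in members]
            for e, members in data["edge-dict"].items()
        },
    }


converted = convert_ids(data)
print(converted["edge-dict"])  # {0: [1, 2], 1: [2, 3]}
```

The original dictionary is left unchanged, so the string IDs remain available if they are needed later.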

Data sets available on xgi-data

Currently available data sets are:

These datasets can be loaded with xgi using the following lines:

import xgi
H = xgi.load_xgi_data("<dataset_name>")

where <dataset_name> is chosen from the list above.

These datasets have been taken from the following sources:

Repository Description

index.json is a dictionary that maps each dataset currently available on xgi-data to the URL where it is hosted. The code folder contains the scripts used to convert hypergraph datasets into a more standard format, along with the JSON inspection script. This code can be adapted to convert datasets that are not yet part of xgi-data into the xgi-data format.
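
The lookup that index.json supports might be sketched as follows. The layout shown (dataset name mapped to an object with a "url" field), the dataset name, the example.org URL, and the helper dataset_url are all hypothetical. The real file may be organized differently.

```python
import json

# Hypothetical excerpt of index.json. The real entries and layout may differ.
index_text = """
{
    "example-dataset": {"url": "https://example.org/example-dataset.json"}
}
"""

index = json.loads(index_text)


def dataset_url(index, name):
    """Return the hosting URL for a dataset name, or raise a descriptive error."""
    try:
        return index[name]["url"]
    except KeyError:
        available = ", ".join(sorted(index))
        raise ValueError(f"Unknown dataset {name!r}; available: {available}") from None


url = dataset_url(index, "example-dataset")
print(url)
```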

Checking dataset format

To check whether a file conforms to the xgi-data format, run the following command:

python inspect_json.py filepath.json
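
For intuition about what a check of this kind involves, here is a minimal sketch. It is not the repository's inspect_json.py. The required key names and the function check_xgi_data are assumptions for illustration, and the actual script may check different or additional properties.

```python
# Assumed top-level keys. These names are not confirmed by this README.
REQUIRED_KEYS = ("hypergraph-data", "node-data", "edge-data", "edge-dict")


def check_xgi_data(data):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = [f"missing top-level key: {k}" for k in REQUIRED_KEYS if k not in data]
    if problems:
        return problems

    for edge_id, members in data["edge-dict"].items():
        if not isinstance(edge_id, str):
            problems.append(f"edge ID {edge_id!r} is not a string")
        if not isinstance(members, list):
            problems.append(f"edge {edge_id!r}: members are not stored as a list")
            continue
        for node in members:
            if not isinstance(node, str):
                problems.append(f"edge {edge_id!r}: node ID {node!r} is not a string")
    return problems


good = {
    "hypergraph-data": {},
    "node-data": {"1": {}, "2": {}},
    "edge-data": {"0": {}},
    "edge-dict": {"0": ["1", "2"]},
}
bad = {
    "hypergraph-data": {},
    "node-data": {},
    "edge-data": {},
    "edge-dict": {"0": [1, "2"]},
}

print(check_xgi_data(good))
print(check_xgi_data(bad))
```

Returning a list of problems, rather than stopping at the first one, lets a single run report every issue in a file.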

Funding

The XGI-DATA package has been supported by NSF Grant 2121905, "HNDS-I: Using Hypergraphs to Study Spreading Processes in Complex Social Networks".