GraphSlim is a PyTorch library for graph reduction. It takes a graph in PyG format as input and outputs a reduced graph that preserves the properties or performance of the original graph.
Example scripts are in the examples folder.
Choose either requirements_torch1+.txt (torch 1.*) or requirements.txt (torch 2.*), whichever matches your setup.
Change the CUDA version of torch, torch-geometric, and torch-sparse in the requirements file to match your system configuration.
# choose one version from https://data.pyg.org/whl/ based on your environment
pip install torch_scatter torch_sparse -f https://data.pyg.org/whl/torch-${TORCH}+${CUDA}.html
pip install graphslim
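The ${TORCH} and ${CUDA} placeholders in the wheel URL above have to match the installed PyTorch build. As a small illustration, the helper below assembles that URL from a version string and a CUDA tag. The function name `pyg_wheel_index` is made up for this sketch, and the version values in the usage line are examples only. With PyTorch installed, `torch.__version__` and `torch.version.cuda` report the values to plug in.

```python
def pyg_wheel_index(torch_version: str, cuda_tag: str) -> str:
    """Build the PyG wheel index URL for a torch version and CUDA tag.

    Hypothetical helper for illustration; it only formats a string.
    """
    # torch.__version__ may carry a local suffix such as "+cu121"; drop it.
    base_version = torch_version.split("+")[0]
    return f"https://data.pyg.org/whl/torch-{base_version}+{cuda_tag}.html"


# Example values only; substitute the ones reported by your environment.
print(pyg_wheel_index("2.1.0+cu121", "cu121"))
```

The printed URL can be passed to the -f option of the pip command shown above.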
python examples/train_coreset.py
python examples/train_coarsen.py
python examples/train_gcond.py
See more examples in Benchmark Scripts.
cd graphslim
python train_all.py -xxx xx
Run python configs.py --help to list all command-line options.
Options:
-D, --dataset TEXT  [default: cora]
-G, --gpu_id INTEGER  gpu id start from 0, -1 means cpu [default: 0]
--setting [trans|ind]  transductive or inductive setting
--split TEXT  only support public split now, do not change it [default: fixed]
--run_reduction INTEGER  repeat times of reduction [default: 3]
--run_eval INTEGER  repeat times of final evaluations [default: 10]
--run_inter_eval INTEGER  repeat times of intermediate evaluations [default: 5]
--eval_interval INTEGER  [default: 100]
-H, --hidden INTEGER  [default: 256]
--eval_epochs, --ee INTEGER  [default: 300]
--eval_model, --em [GCN|GAT|SGC|APPNP|Cheby|GraphSage|GAT|SGFormer]  [default: GCN]
--condense_model [GCN|GAT|SGC|APPNP|Cheby|GraphSage|GAT]  [default: SGC]
-E, --epochs INTEGER  number of reduction epochs [default: 1000]
--lr FLOAT  [default: 0.01]
--weight_decay, --wd INTEGER  [default: 0]
--pre_norm BOOLEAN  pre-normalize features, forced true for arxiv, flickr and reddit [default: True]
--outer_loop INTEGER  [default: 10]
--inner_loop INTEGER  [default: 1]
-R, --reduction_rate FLOAT  -1 means use representative reduction rate; reduction rate of training set, defined as (number of nodes in small graph)/(number of nodes in original graph) [default: -1.0]
-S, --seed INTEGER  Random seed [default: 1]
--nlayers INTEGER  number of GNN layers of condensed model [default: 2]
-V, --verbose
--init [variation_neighborhoods|variation_edges|variation_cliques|heavy_edge|algebraic_JC|affinity_GS|kron|vng|clustering|averaging|cent_d|cent_p|kcenter|herding|random]  features initialization methods
-M, --method [variation_neighborhoods|variation_edges|variation_cliques|heavy_edge|algebraic_JC|affinity_GS|kron|vng|clustering|averaging|gcond|doscond|gcondx|doscondx|sfgc|msgc|disco|sgdd|gcsntk|geom|cent_d|cent_p|kcenter|herding|random]  [default: kcenter]
--activation [sigmoid|tanh|relu|linear|softplus|leakyrelu|relu6|elu]  activation function when do NAS [default: relu]
-A, --attack [random_adj|metattack|random_feat]  corruption method
-P, --ptb_r FLOAT  perturbation rate for corruptions [default: 0.25]
--aggpreprocess  use aggregation for coreset methods
--dis_metric TEXT  distance metric for all condensation methods, ours means metric used in GCond paper [default: ours]
--lr_adj FLOAT  [default: 0.0001]
--lr_feat FLOAT  [default: 0.0001]
--threshold INTEGER  sparsification threshold before evaluation [default: 0]
--dropout FLOAT  [default: 0.0]
--ntrans INTEGER  number of transformations in SGC and APPNP [default: 1]
--with_bn
--no_buff  skip the buffer generation and use existing in geom, sfgc
--batch_adj INTEGER  batch size for msgc [default: 1]
--alpha FLOAT  for appnp [default: 0.1]
--mx_size INTEGER  for gcsntk methods, avoid SVD error [default: 100]
--save_path, --sp TEXT  save path for synthetic graph [default: ../checkpoints]
-W, --eval_whole  if run on whole graph
--help  Show this message and exit.
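The --reduction_rate option is defined above as the ratio of nodes in the reduced graph to nodes in the original graph. A minimal sketch of that arithmetic follows. The function name `reduced_node_count` and the rounding rule (nearest integer, at least one node) are assumptions for illustration and may not match how the library rounds. The special value -1 (representative reduction rate) is not handled here.

```python
def reduced_node_count(n_original: int, reduction_rate: float) -> int:
    """Number of nodes kept for a given reduction rate (illustrative only)."""
    if not 0 < reduction_rate <= 1:
        raise ValueError("reduction_rate must be in (0, 1]")
    # Assumed rounding rule: nearest integer, but never fewer than one node.
    return max(1, round(n_original * reduction_rate))


# Hypothetical graph with 1000 training nodes.
for rate in (0.5, 0.25, 0.01):
    print(rate, reduced_node_count(1000, rate))
```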
from graphslim.dataset import *
from graphslim.evaluation import *
from graphslim.condensation import GCond
from graphslim.config import cli
args = cli(standalone_mode=False)
# customize args here
args.reduction_rate = 0.5
args.device = 'cuda:0'
# add more args.<main_args/dataset_args> here
graph = get_dataset('cora', args=args)
# To reproduce the benchmark, use our args and graph class.
# To use your own args and graph format, make sure both provide the required attributes.
# create an agent of one reduction algorithm
# add more args.<agent_args> here
agent = GCond(setting='trans', data=graph, args=args)
# reduce the graph
reduced_graph = agent.reduce(graph, verbose=True)
# create an evaluator
# add more args.<evaluator_args> here
evaluator = Evaluator(args)
# evaluate the reduced graph on a GNN model
res_mean, res_std = evaluator.evaluate(reduced_graph, model_type='GCN')
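The evaluate call above returns a mean and a standard deviation, which suggests that results are aggregated over repeated runs (compare the --run_eval option). The sketch below shows one way such a summary could be computed from per-run accuracies. The function name `summarize_runs`, the sample values, and the choice of population standard deviation are assumptions, not the library's code.

```python
import statistics


def summarize_runs(accuracies):
    """Return (mean, std) over repeated evaluation runs."""
    if not accuracies:
        raise ValueError("need at least one run")
    mean = statistics.fmean(accuracies)
    # Population standard deviation (the default of numpy.std); assumed here.
    std = statistics.pstdev(accuracies)
    return mean, std


runs = [80.0, 82.0, 84.0]  # made-up accuracies, in percent
res_mean, res_std = summarize_runs(runs)
print(f"{res_mean:.2f} +/- {res_std:.2f}")
```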
All parameters can be divided into the following groups:
<main_args>: dataset, method, setting, reduction_rate, seed, aggpreprocess, eval_whole, run_reduction
<attack_args>: attack, ptb_r
<dataset_args>: pre_norm, save_path, split, threshold
<agent_args>: init, eval_interval, eval_epochs, eval_model, condense_model, epochs, lr, weight_decay, outer_loop, inner_loop, nlayers, method, activation, dropout, ntrans, with_bn, no_buff, batch_adj, alpha, mx_size, dis_metric, lr_adj, lr_feat
<evaluator_args>: final_eval_model, eval_epochs, lr, weight_decay
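Some parameter names appear in more than one of the groups listed above. As a rough aid, the sketch below copies the groups into a dictionary and looks up which groups a name belongs to. The dictionary and both helper functions are illustrative and are not part of GraphSlim.

```python
# Parameter groups copied from the list above.
ARG_GROUPS = {
    "main_args": ("dataset method setting reduction_rate seed aggpreprocess "
                  "eval_whole run_reduction").split(),
    "attack_args": "attack ptb_r".split(),
    "dataset_args": "pre_norm save_path split threshold".split(),
    "agent_args": ("init eval_interval eval_epochs eval_model condense_model "
                   "epochs lr weight_decay outer_loop inner_loop nlayers method "
                   "activation dropout ntrans with_bn no_buff batch_adj alpha "
                   "mx_size dis_metric lr_adj lr_feat").split(),
    "evaluator_args": "final_eval_model eval_epochs lr weight_decay".split(),
}


def groups_for(name):
    """Return every group that lists the given parameter name."""
    return [group for group, names in ARG_GROUPS.items() if name in names]


def shared_parameters():
    """Return the parameters that appear in more than one group."""
    all_names = {n for names in ARG_GROUPS.values() for n in names}
    return sorted(n for n in all_names if len(groups_for(n)) > 1)


print(groups_for("lr"))
print(shared_parameters())
```

Setting a shared parameter such as lr therefore affects both the reduction agent and the evaluator, going by the grouping above.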
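To add a new reduction algorithm, create a class under sparsification, coarsening, or condensation and inherit the Base class.
To add a new dataset, create a class in dataset/loader.py and inherit the TransAndInd class.
To extend the evaluation, edit evaluation/eval_agent.py.
To add a new GNN model, create a class under models and inherit the Base class.
To customize sparsification before evaluation, modify sparsify in evaluation/utils.py.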
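The customization notes above mention inheriting a Base class, and the usage example constructs an agent with setting, data, and args before calling reduce. The sketch below mimics that shape with a toy random-selection method. The `Base` class here is a stand-in written for this example, not the library's class, and the graph is a plain namespace with nodes and edges rather than a PyG object. The real base class may require other methods or attributes.

```python
import random
from types import SimpleNamespace


class Base:
    """Stand-in base class (hypothetical, not GraphSlim's Base)."""

    def __init__(self, setting, data, args):
        self.setting = setting
        self.data = data
        self.args = args

    def reduce(self, data, verbose=False):
        raise NotImplementedError


class RandomNodes(Base):
    """Toy reduction: keep a random subset of nodes and the edges among them."""

    def reduce(self, data, verbose=False):
        n_keep = max(1, round(len(data.nodes) * self.args.reduction_rate))
        rng = random.Random(self.args.seed)
        kept = sorted(rng.sample(data.nodes, n_keep))
        kept_set = set(kept)
        # Keep only edges whose two endpoints both survive.
        edges = [(u, v) for u, v in data.edges if u in kept_set and v in kept_set]
        if verbose:
            print(f"kept {len(kept)} of {len(data.nodes)} nodes")
        return SimpleNamespace(nodes=kept, edges=edges)


# A 10-node path graph as toy input.
graph = SimpleNamespace(nodes=list(range(10)), edges=[(i, i + 1) for i in range(9)])
args = SimpleNamespace(reduction_rate=0.5, seed=1)
agent = RandomNodes(setting="trans", data=graph, args=args)
reduced = agent.reduce(graph, verbose=True)
```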
Our web application is deployed online using Streamlit. It can also be started locally using:
cd interface
python -m streamlit run vis_graphslim.py
Install the dependencies listed in interface/requirements.txt before starting the interface.
Some of the algorithm implementations are based on the paper authors' original code and on other packages.