Namkyeong / AFGRL

The official source code for "Augmentation-Free Self-Supervised Learning on Graphs"
https://arxiv.org/abs/2112.02472
76 stars 15 forks source link

Augmentation-Free Self-Supervised Learning on Graphs

The official source code for Augmentation-Free Self-Supervised Learning on Graphs paper, accepted at AAAI 2022.

Overview

Inspired by the recent success of self-supervised methods applied on images, self-supervised learning on graph structured data has seen rapid growth especially centered on augmentation-based contrastive methods. However, we argue that without carefully designed augmentation techniques, augmentations on graphs may behave arbitrarily in that the underlying semantics of graphs can drastically change. As a consequence, the performance of existing augmentation-based methods is highly dependent on the choice of augmentation scheme, i.e., hyperparameters associated with augmentations. In this paper, we propose a novel augmentation-free self-supervised learning framework for graphs, named AFGRL. Specifically, we generate an alternative view of a graph by discovering nodes that share the local structural information and the global semantics with the graph. Extensive experiments towards various node-level tasks, i.e., node classification, clustering, and similarity search on various real-world datasets demonstrate the superiority of AFGRL.

Augmentations on images keep the underlying semantics, whereas augmentations on graphs may unexpectedly change the semantics.

Requirements

Hyperparameters

Following Options can be passed to main.py

--dataset: Name of the dataset. Supported names are: wikics, cs, computers, photo, and physics. Default is wikics.
usage example :--dataset wikics

--task: Name of the task. Supported names are: node, clustering, similarity. Default is node.
usage example :--task node

--layers: The number of units of each layer of the GNN. Default is [256]
usage example :--layers 256

--pred_hid: The number of hidden units of predictor. Default is [512]
usage example :--pred_hid 512

--topk: The number of neighbors for nearest neighborhood search. Default is 4.
usage example :--topk 4

--num_centroids: The number of centroids for K-means Clustering . Default is 100.
usage example :--num_centroids 100

--num_kmeans: The number of iterations for K-means Clustering . Default is 5.
usage example :--num_kmeans 5

How to Run

You can run the model with following options

Cite (Bibtex)