Region2vec: Community Detection on Spatial Networks Using Graph Embedding with Node Attributes and Spatial Interactions
Abstract: Community detection algorithms are used to detect densely connected components in complex networks and reveal underlying relationships among components. As a special type of network, spatial networks are usually generated by the connections among geographic regions. Identifying the spatial network communities can help reveal the spatial interaction patterns, understand the hidden regional structures, and support regional development decision-making. Given the recent development of Graph Convolutional Networks (GCN) and their powerful performance in identifying multi-scale spatial interactions, we propose an unsupervised GCN-based community detection method, "region2vec", for spatial networks. Our method first generates node embeddings for regions that share common attributes and have intense spatial interactions, and then applies clustering algorithms to detect communities based on their embedding similarity and geographic adjacency. Experimental results show that while existing methods trade off attribute similarities against spatial interactions, "region2vec" balances the two and performs best when one wants to maximize both attribute similarities and spatial interactions within communities.
If you find our code useful for your research, you may cite our paper:
Liang, Y., Zhu, J., Ye, W., and Gao, S. (2022). Region2vec: Community Detection on Spatial Networks Using Graph Embedding with Node Attributes and Spatial Interactions. In Proceedings of the 30th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (ACM SIGSPATIAL 2022), November 1-4, 2022, Seattle, WA, USA. DOI: https://doi.org/10.1145/3557915.3560974
@inproceedings{liang2022regions2vec,
title={Region2vec: Community Detection on Spatial Networks Using Graph Embedding with Node Attributes and Spatial Interactions},
author={Liang, Yunlei and Zhu, Jiawei and Ye, Wen and Gao, Song},
booktitle={Proceedings of 30th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
(ACM SIGSPATIAL 2022), November 1-4, 2022, Seattle, WA, USA},
year={2022},
pages={1--4},
doi={10.1145/3557915.3560974}
}
Region2vec uses the following packages with Python 3.7:
numpy==1.19.5
pandas==0.24.1
scikit_learn==1.1.2
scipy==1.3.1
torch==1.4.0 (torch>=2.2.0 is suggested to address security alerts)
python train.py
python clustering.py --filename your_filename
Here, 'your_filename' should be replaced with the name of the file generated in step 1.
bash run_clustering.sh
Note: the final results (e.g., metric values) may vary depending on the platform and package versions. The current results were obtained on Ubuntu with all package versions as listed in requirements.txt.
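To make the clustering step more concrete, the following is a minimal sketch of clustering region embeddings under a geographic adjacency constraint. It assumes scikit-learn's AgglomerativeClustering with a connectivity matrix; this README does not state which clustering algorithm clustering.py uses, so the choice of algorithm, the variable names (embeddings, adjacency), and the toy values are all illustrative assumptions rather than the repository's implementation.

```python
import numpy as np
from scipy.sparse import csr_matrix
from sklearn.cluster import AgglomerativeClustering

# Toy embeddings for six regions (hypothetical values, not model output).
embeddings = np.array([
    [0.0, 0.1],
    [0.1, 0.0],
    [0.2, 0.1],
    [5.0, 5.1],
    [5.1, 5.0],
    [5.2, 5.1],
])

# Toy geographic adjacency: the regions form a chain 0-1-2-3-4-5.
dense = np.zeros((6, 6), dtype=int)
for i in range(5):
    dense[i, i + 1] = 1
    dense[i + 1, i] = 1
adjacency = csr_matrix(dense)

# The connectivity argument only allows merges between adjacent regions,
# so every resulting community is spatially contiguous.
model = AgglomerativeClustering(n_clusters=2, linkage="ward", connectivity=adjacency)
labels = model.fit_predict(embeddings)
print(labels)
```

In this toy case the two groups of similar embeddings are also contiguous, so the adjacency constraint and the embedding similarity agree. With real data the constraint prevents two similar but distant regions from being placed in the same community.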
The data files used in our method are listed below with detailed descriptions.
Flow_matrix.csv: The visitor flow matrix between Census Tracts in Wisconsin (the spatial flow interaction matrix).
Spatial_matrix.csv: The adjacency matrix generated based on the geographic adjacency relationship.
Spatial_matrix_rook.csv: The adjacency matrix generated based on the geographic adjacency relationship with the rook-type contiguity relationship.
Spatial_distance_matrix.csv: The hop distance between Census Tracts, calculated from the spatial adjacency matrix.
flow_reID.csv: The visitor flows with updated IDs of Census Tracts.
feature_matrix_f1.csv: The features of nodes (Census Tracts).
feature_matrix_lwinc.csv: The low-income population feature of nodes, used for generating the homogeneous scores.
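As an illustration of what a hop distance derived from an adjacency matrix means, the sketch below runs a breadth-first search from each node of a small toy adjacency matrix. The function name hop_distances and the toy matrix are hypothetical. How Spatial_distance_matrix.csv was actually produced is not described here, so this is only one plausible way to compute such a matrix.

```python
from collections import deque


def hop_distances(adjacency):
    """Return a matrix of hop counts between all pairs of nodes.

    adjacency is a square list of lists with 1 where two nodes are
    neighbors and 0 otherwise. Unreachable pairs are marked with -1.
    """
    n = len(adjacency)
    dist = [[-1] * n for _ in range(n)]
    for source in range(n):
        dist[source][source] = 0
        queue = deque([source])
        # Breadth-first search: nodes are reached in order of hop count.
        while queue:
            node = queue.popleft()
            for neighbor in range(n):
                if adjacency[node][neighbor] and dist[source][neighbor] == -1:
                    dist[source][neighbor] = dist[source][node] + 1
                    queue.append(neighbor)
    return dist


# Toy example: four regions in a chain 0-1-2-3 (not the Wisconsin data).
toy_adjacency = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
toy_hops = hop_distances(toy_adjacency)
for row in toy_hops:
    print(row)
```

Adjacent regions have a hop distance of 1, regions that share a common neighbor have a hop distance of 2, and so on.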
We acknowledge the funding support from the County Health Rankings and Roadmaps program of the University of Wisconsin Population Health Institute, Wisconsin Department of Health Services, and the National Science Foundation funded AI institute [Grant No. 2112606] for Intelligent Cyberinfrastructure with Computational Learning in the Environment (ICICLE). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the funders.