This repository contains the code for the reproducibility of the experiments presented in the paper "Filling the G_ap_s: Multivariate Time Series Imputation by Graph Neural Networks" (ICLR 2022). In this paper, we propose a graph neural network architecture for multivariate time series imputation and achieve state-of-the-art results on several benchmarks.
Authors: Andrea Cini, Ivan Marisca, Cesare Alippi
‼️ PyG implementation of GRIN is now available inside Torch Spatiotemporal, a library built to accelerate research on neural spatiotemporal data processing methods, with a focus on Graph Neural Networks.
The paper introduces GRIN, a method and an architecture to exploit relational inductive biases to reconstruct missing values in multivariate time series coming from sensor networks. GRIN features a bidirectional recurrent GNN which learns spatio-temporal node-level representations tailored to reconstruct observations at neighboring nodes.
The directory is structured as follows:
.
├── config
│ ├── bimpgru
│ ├── brits
│ ├── grin
│ ├── mpgru
│ ├── rgain
│ └── var
├── datasets
│ ├── air_quality
│ ├── metr_la
│ ├── pems_bay
│ └── synthetic
├── lib
│ ├── __init__.py
│ ├── data
│ ├── datasets
│ ├── fillers
│ ├── nn
│ └── utils
├── requirements.txt
└── scripts
├── run_baselines.py
├── run_imputation.py
└── run_synthetic.py
Note that, given the size of the files, the datasets are not readily available in the folder. See the next section for the downloading instructions.
All the datasets used in the experiment, except CER-E, are open and can be downloaded from this link. The CER-E dataset can be obtained free of charge for research purposes following the instructions at this link. We recommend storing the downloaded datasets in a folder named datasets
inside this directory.
The config
directory stores all the configuration files used to run the experiment. They are divided into folders, according to the model.
The support code, including the models and the datasets readers, are packed in a python library named lib
. Should you have to change the paths to the datasets location, you have to edit the __init__.py
file of the library.
The scripts used for the experiment in the paper are in the scripts
folder.
run_baselines.py
is used to compute the metrics for the MEAN
, KNN
, MF
and MICE
imputation methods. An example of usage is
python ./scripts/run_baselines.py --datasets air36 air --imputers mean knn --k 10 --in-sample True --n-runs 5
run_imputation.py
is used to compute the metrics for the deep imputation methods. An example of usage is
python ./scripts/run_imputation.py --config config/grin/air36.yaml --in-sample False
run_synthetic.py
is used for the experiments on the synthetic datasets. An example of usage is
python ./scripts/run_synthetic.py --config config/grin/synthetic.yaml --static-adj False
We run all the experiments in python 3.8
, see requirements.txt
for the list of pip
dependencies.
If you find this code useful please consider to cite our paper:
@inproceedings{cini2022filling,
title={Filling the G\_ap\_s: Multivariate Time Series Imputation by Graph Neural Networks},
author={Andrea Cini and Ivan Marisca and Cesare Alippi},
booktitle={International Conference on Learning Representations},
year={2022},
url={https://openreview.net/forum?id=kOu3-S3wJ7}
}