peaclab / CoMTE

CoMTE

Counterfactual Explanations for Multivariate Time Series Data

CoMTE is a novel counterfactual multivariate time series explainability method that provides explanations for individual predictions. The counterfactual explanations consist of hypothetical samples that are as similar as possible to the sample that is explained, while having a different classification label; i.e., "if the values of these particular time series were different in the given sample, the classification label would have been different."

Counterfactual explanations are generated by selecting time series from the training set and substituting them in the sample under investigation to obtain different classification results. In this way, end users can understand the classification decision by examining a limited number of variables.
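The substitution idea can be illustrated with a toy sketch. This is not CoMTE's implementation or its search strategy: it is a plain exhaustive search over subsets of variables, and `toy_predict`, `smallest_substitution`, the array layout (variables as rows, time steps as columns), and the synthetic data are all assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def toy_predict(sample):
    # Hypothetical stand-in classifier: class 1 if variable 2 has a mean above 0.5.
    return int(sample[2].mean() > 0.5)


def smallest_substitution(sample, distractor, predict, target):
    """Find the smallest set of variables (rows) that, when copied from
    `distractor` into `sample`, makes `predict` return `target`.

    Returns (counterfactual, changed_variables), or (None, ()) if no
    substitution reaches the target class.
    """
    n_vars = sample.shape[0]
    # Try subsets in order of increasing size so the first hit is minimal.
    for k in range(1, n_vars + 1):
        for subset in combinations(range(n_vars), k):
            rows = list(subset)
            candidate = sample.copy()
            candidate[rows] = distractor[rows]
            if predict(candidate) == target:
                return candidate, subset
    return None, ()


# Synthetic data: 4 variables, 50 time steps each.
sample = np.zeros((4, 50))      # classified as class 0 by toy_predict
distractor = np.ones((4, 50))   # stands in for a training sample of class 1

counterfactual, changed = smallest_substitution(
    sample, distractor, toy_predict, target=1
)
print(changed)  # (2,)
```

Here only variable 2 has to be replaced to flip the label, so the explanation points the user to a single time series. An exhaustive search like this grows exponentially with the number of variables, so it is practical only for small examples.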

Maintainer:

Developers:

Requirements

The NATOPS data set is included in the repo, so you can start using CoMTE without downloading any other data sets.

For more examples, the HPC data sets used in our paper are hosted on Zenodo. Clicking download sends an access request to the owner; once the request is approved, you will receive a link to download the HPC data sets.

A Python 3.x installation is required, as well as the packages listed in requirements.txt and the fast_features package.

pip3 install --user -r requirements.txt

Installation instructions for the fast_features package are in the fast_features directory.

Usage

The code assumes that the data is located at ./data.
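A quick check before running anything can catch a missing data directory early. This is a minimal sketch, not part of the repository: `data_dir_ready` is a hypothetical helper, and it only assumes that the directory is named data and sits under the working directory.

```python
from pathlib import Path


def data_dir_ready(root="."):
    """Return True if `root`/data exists and contains at least one entry."""
    data_dir = Path(root) / "data"
    return data_dir.is_dir() and any(data_dir.iterdir())


if not data_dir_ready():
    print("No data found under ./data")
```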

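import explainers

# pipeline, timeseries, and labels must already be defined: the classifier to explain
# and the time series data with its labels (not shown here)
comte = explainers.OptimizedSearch(pipeline, timeseries, labels, silent=False, threads=1)

# Call the explain method with one sample, selected here from test_timeseries by its index label
comte.explain(test_timeseries.loc[['5c15428439747d4a8fa8f85d_60'], :, :], to_maximize=5, savefig=False)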

Known Issues

Authors

ICAPAI'21: Counterfactual Explanations for Multivariate Time Series

ArXiv: Counterfactual Explanations for Machine Learning on Multivariate Time Series Data

Authors: Emre Ates (1), Burak Aksar (1), Vitus J. Leung (2), Ayse K. Coskun (1)

Affiliations: (1) Department of Electrical and Computer Engineering, Boston University (2) Sandia National Laboratories

This work has been partially funded by Sandia National Laboratories. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under Contract DE-NA0003525.

License

This project is licensed under the BSD 3-Clause License; see the LICENSE file for details.