This repository contains the official PyTorch implementation of **Sequence Modeling with Multiresolution Convolutional Memory** by Jiaxin Shi, Ke Alexander Wang, and Emily B. Fox.
Paper: [abstract] [pdf]
TL;DR: We introduce a new SOTA convolutional sequence modeling layer that is simple to implement (15 lines of PyTorch code using standard convolution and linear operators) and requires at most O(N log N) time and memory.
The key component of the layer is a multiresolution convolution operation (MultiresConv, left in the figure) that mimics the computational structure of wavelet-based multiresolution analysis.
We use it to build a memory ($\mathbf{z}_n$ in the figure) for long context modeling which captures multiscale trends of the data.
Our layer is simple (it is linear) and parameter-efficient (it uses depthwise convolutions, with filters shared across timescales), making it easy to integrate with modern architectural components such as gated activations, residual blocks, and normalization.
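The paper describes the layer as implementable in roughly 15 lines of PyTorch using standard convolution and linear operators. The snippet below is a minimal sketch of that idea, not the repository's actual implementation: the class name `MultiresConvSketch`, its arguments, the initialization, and the exact way the scales are mixed are assumptions made for illustration. It applies two depthwise filter banks, shared across all timescales, as causal dilated convolutions, and linearly combines the multiscale outputs into the memory $\mathbf{z}_n$.

```python
# Illustrative sketch only; names and details here are assumptions,
# not the code shipped in this repository.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiresConvSketch(nn.Module):
    """Causal multiresolution depthwise convolution (wavelet-style), sketched."""

    def __init__(self, channels, seq_len, kernel_size=2):
        super().__init__()
        # O(log N) resolutions are enough to cover a length-N context window.
        self.depth = max(1, math.ceil(math.log2(seq_len)))
        self.kernel_size = kernel_size
        self.channels = channels
        # Two depthwise filter banks shared across every timescale:
        # h acts like a low-pass wavelet filter, g like a high-pass one.
        self.h = nn.Parameter(torch.randn(channels, 1, kernel_size) / kernel_size)
        self.g = nn.Parameter(torch.randn(channels, 1, kernel_size) / kernel_size)
        # Per-channel weights that linearly mix the multiscale outputs into the memory.
        self.w = nn.Parameter(
            torch.randn(self.depth + 2, channels) / math.sqrt(self.depth + 2)
        )

    def forward(self, x):
        # x: (batch, channels, length)
        z = self.w[0, None, :, None] * x  # finest scale: the input itself
        lowpass = x
        for j in range(self.depth):
            dilation = 2 ** j
            pad = dilation * (self.kernel_size - 1)  # left padding => causal
            detail = F.conv1d(F.pad(lowpass, (pad, 0)), self.g,
                              dilation=dilation, groups=self.channels)
            lowpass = F.conv1d(F.pad(lowpass, (pad, 0)), self.h,
                               dilation=dilation, groups=self.channels)
            z = z + self.w[j + 1, None, :, None] * detail
        z = z + self.w[-1, None, :, None] * lowpass  # coarsest scale
        return z  # memory z_n, same shape as x


# Example usage (shapes are arbitrary):
# layer = MultiresConvSketch(channels=64, seq_len=1024)
# z = layer(torch.randn(8, 64, 1024))  # -> (8, 64, 1024)
```

Because the layer itself is linear, a sketch like this would typically be wrapped with the gated activations, residual connections, and normalization mentioned above.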
```bash
pip install -r requirements.txt
```
For the Long ListOps and PTB-XL experiments, please follow the comments in `dataloaders` to download and prepare the datasets.
We provide multi-GPU training code for all experiments. For example,

```bash
bash scripts/seq_cifar.sh
```

will run the sequential CIFAR-10 classification experiment with 2 GPUs using the settings in the paper.
The main file for classification experiments is `classification.py`.
The training and evaluation code for autoregressive generative modeling is in `autoregressive.py` and `autogressive_eval.py`.
If you find this code useful, please cite our work:
```bibtex
@inproceedings{shi2023sequence,
  title={Sequence Modeling with Multiresolution Convolutional Memory},
  author={Shi, Jiaxin and Wang, Ke Alexander and Fox, Emily B.},
  booktitle={International Conference on Machine Learning},
  year={2023}
}
```