
Beat Transformer


Repository for the paper Beat Transformer: Demixed Beat and Downbeat Tracking with Dilated Self-Attention, in Proceedings of the 23rd International Society for Music Information Retrieval Conference (ISMIR 2022), Bengaluru, India.

You are welcome to test our model on your own music via our Google Colab notebook.
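
If you would rather run inference locally than on Colab, the sketch below shows the general shape of such a script. Treat every name in it as an assumption made for illustration: the class name `Demixed_DilatedTransformerModel`, its constructor, the checkpoint file name and layout, and the tensor shapes are all hypothetical; consult `code/DilatedTransformer.py` and the Colab notebook for the actual interface.

```python
# Minimal local-inference sketch. All names below are illustrative
# assumptions; the real interface lives in code/DilatedTransformer.py
# and the Colab notebook.
import torch

from DilatedTransformer import Demixed_DilatedTransformerModel  # hypothetical import

model = Demixed_DilatedTransformerModel()  # hypothetical constructor defaults
state = torch.load("checkpoint/model.pt", map_location="cpu")  # hypothetical file name
model.load_state_dict(state)
model.eval()

# One batch of demixed spectrograms: (batch, stems, frames, mel bins) is an
# assumed layout; the paper demixes each piece into 5 instrument stems.
x = torch.randn(1, 5, 1000, 128)
with torch.no_grad():
    beat_logits, downbeat_logits = model(x)  # assumed output pair
```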

Code and File Directory

This repository is organized as follows:

root
│
└───checkpoint                          PyTorch model checkpoints
│   │   ···
│
└───code
│   └───ablation_models                 ablation models
│   │   │   ···
│   │   DilatedTransformer.py           Beat Transformer model
│   │   DilatedTransformerLayer.py      Dilated Self-Attention
│   │   spectrogram_dataset.py          data loader
│   │   train.py                        training script
│   │   ...                             code for other utilities
│
└───data
│   └───audio_lists                     order info of pieces in each dataset
│   │   │   ···
│   │   demix_spectrogram_data.npz      demixed spectrogram data (33GB, to be downloaded)
│   │   full_beat_annotation.npz        beat/downbeat annotation
│
└───preprocessing                       code for data pre-processing
│   │   ···
│
└───save                                training log and more
    │   ···
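
After downloading the two `.npz` archives under `data/`, a quick way to get oriented is to inspect them with NumPy. Their internal key layout is not documented here, so the sketch below simply lists whatever arrays each archive contains:

```python
# Peek inside the packaged data archives. allow_pickle=True is needed only if
# the arrays are stored as Python objects (an assumption; try without it first).
import numpy as np

spectrograms = np.load("data/demix_spectrogram_data.npz", allow_pickle=True)
annotations = np.load("data/full_beat_annotation.npz", allow_pickle=True)

print(spectrograms.files)  # names of the stored arrays, e.g. one per dataset
print(annotations.files)

first_key = annotations.files[0]
print(type(annotations[first_key]), getattr(annotations[first_key], "shape", None))
```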

How to run

Audio Data

We use a total of 7 datasets for model training and testing. If you wish to acquire the audio data, follow the guidelines below:

For the beat/downbeat annotations of Ballroom, GTZAN, SMC, and Hainsworth, we used the annotations released by Sebastian Böck here.

Contact

Jingwei Zhao (PhD student in Data Science at NUS)

jzhao@u.nus.edu

Nov. 24, 2022