BingYang-20 / SAR-SSL

A Python implementation of “Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Multi-Channel Conformer”
MIT License

SAR-SSL

Datasets

Quick start

Data generation

1. Download the datasets into folders arranged according to the following directory structure:

  .-SAR-SSL
  | .-code
  | .-data
  | .-exp
  .-data
    .-SrcSig
    | .-wsj0
    |   .-dt
    |   .-et
    |   .-tr
    .-RIR
    | .-Mesh
    | | .-S32-M441_npy
    | .-MIRDB
    | | .-Impulse_response_Acoustic_Lab_Bar-Ilan_University
    | .-DCASE
    | | .-TAU-SRIR_DB
    | | .-TAU-SNoise_DB
    | .-dEchorate
    | | .-dEchorate_database.csv
    | | .-dEchorate_rir.h5
    | | .-dEchorate_annotations.h5
    | | .-dEchorate_noise_gzip7.hdf5
    | | .-dEchorate_babble_gzip7.hdf5
    | | .-dEchorate_silence_gzip7.hdf5
    | .-BUTReverb
    | | .-RIRs
    | .-ACE
    |   .-RIRN
    |   .-Data
    .-MicSig
      .-LOCATA
        .-dev
        .-eval
      .-MC_WSJ_AV
      .-LibriCSS
      .-AMIMeeting
      .-AISHELL-4
      .-AliMeeting
      .-RealMAN
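
Before generating data, it can help to confirm that the folders match the tree above. The following is a minimal sketch, not part of this repository: the names `EXPECTED_DIRS` and `missing_dirs` are hypothetical, the relative paths are copied from the tree, and the location of the `data` root on your machine is an assumption you pass in yourself. It only checks that directories exist, not that the datasets inside them are complete.

```python
from pathlib import Path

# Relative paths copied from the directory tree above (directories only).
# EXPECTED_DIRS and missing_dirs are hypothetical helpers for illustration;
# they are not provided by the SAR-SSL code.
EXPECTED_DIRS = [
    "SrcSig/wsj0/dt",
    "SrcSig/wsj0/et",
    "SrcSig/wsj0/tr",
    "RIR/Mesh/S32-M441_npy",
    "RIR/MIRDB/Impulse_response_Acoustic_Lab_Bar-Ilan_University",
    "RIR/DCASE/TAU-SRIR_DB",
    "RIR/DCASE/TAU-SNoise_DB",
    "RIR/dEchorate",
    "RIR/BUTReverb/RIRs",
    "RIR/ACE/RIRN",
    "RIR/ACE/Data",
    "MicSig/LOCATA/dev",
    "MicSig/LOCATA/eval",
    "MicSig/MC_WSJ_AV",
    "MicSig/LibriCSS",
    "MicSig/AMIMeeting",
    "MicSig/AISHELL-4",
    "MicSig/AliMeeting",
    "MicSig/RealMAN",
]


def missing_dirs(data_root):
    """Return the expected sub-directories that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED_DIRS if not (root / rel).is_dir()]
```

For example, `missing_dirs("data")` returns an empty list when every folder is in place, and otherwise lists the relative paths that still need to be created or downloaded. You may only need a subset of these datasets, in which case some missing entries are expected.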

2. Generate room impulse responses or microphone signals

3. Training

Others

If you encounter OSError: [Errno 24] Too many open files, raise the open-file limit by running the following at the command line:

  ulimit -n 2048
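
The same limit can also be raised from inside a Python process, which may be convenient when the shell setting is not inherited (for example under a job scheduler). This is a minimal sketch using the standard-library `resource` module, which is available on Unix-like systems only. The helper name `raise_open_file_limit` is hypothetical and is not part of this repository.

```python
import resource


def raise_open_file_limit(target=2048):
    """Raise the soft open-file limit to at least `target`, capped by the hard limit.

    Returns the soft limit in effect after the call.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    unlimited = resource.RLIM_INFINITY

    # Nothing to do if the soft limit is already unlimited or high enough.
    if soft == unlimited or soft >= target:
        return soft

    # An unprivileged process cannot exceed the hard limit.
    new_soft = target if hard == unlimited else min(target, hard)
    resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return new_soft
```

Calling `raise_open_file_limit()` once at the start of a script, before any data loaders are created, has roughly the effect of `ulimit -n 2048` for that process. If the hard limit is below 2048, the soft limit is only raised as far as the hard limit allows.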

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{yang2023sarssl,
    author = "Bing Yang and Xiaofei Li",
    title = "Self-Supervised Learning of Spatial Acoustic Representation with Cross-Channel Signal Reconstruction and Multi-Channel Conformer",
    booktitle = "arXiv preprint arXiv:2312.00476",
    year = "2023",
    pages = ""}

License

MIT