EEND (End-to-End Neural Diarization) is a neural-network-based speaker diarization method. This repository also provides the EEND extension for a variable number of speakers.
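EEND is trained with a permutation-free objective: the binary cross-entropy between predicted and reference speech activities, minimized over all speaker permutations of the reference [1]. The following is a rough NumPy sketch of that idea, not the repository's Chainer implementation; the function name and array shapes are our assumptions.

```python
# Minimal sketch of the permutation-free BCE loss; NOT the repository's
# implementation -- names and shapes are illustrative only.
import itertools
import numpy as np

def pit_bce(pred, label, eps=1e-7):
    """pred, label: (T, S) arrays of per-frame speech activity in [0, 1].

    Returns the binary cross-entropy under the best speaker permutation.
    """
    _, n_spk = label.shape
    losses = []
    for perm in itertools.permutations(range(n_spk)):
        ref = label[:, list(perm)]  # permute reference speaker columns
        bce = -np.mean(ref * np.log(pred + eps)
                       + (1.0 - ref) * np.log(1.0 - pred + eps))
        losses.append(bce)
    return min(losses)  # loss is invariant to speaker ordering
```

Enumerating all permutations is factorial in the number of speakers, which is tractable for the small speaker counts (2-4) used in these recipes.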
```
cd tools
make
```
This command builds Kaldi at tools/kaldi.
If you already have a Kaldi installation:
```
cd tools
make KALDI=<existing_kaldi_root>
```
This option makes a symlink at tools/kaldi.
The make command also installs miniconda3 at tools/miniconda3 and creates a conda environment named 'eend'. If your CUDA toolkit is not installed at /usr/local/cuda, specify CUDA_PATH:
```
cd tools
make CUDA_PATH=/your/path/to/cuda-8.0
```
This command installs the cupy-cudaXX package matching your CUDA version. See https://docs-cupy.chainer.org/en/stable/install.html#install-cupy
Modify egs/mini_librispeech/v1/cmd.sh according to your job scheduler.
If you use your local machine, use "run.pl".
If you use Grid Engine, use "queue.pl".
If you use SLURM, use "slurm.pl".
For more information about cmd.sh, see http://kaldi-asr.org/doc/queue.html.
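For example, a cmd.sh for a local machine might look like the following; the variable names (train_cmd, infer_cmd) follow the usual Kaldi convention and may differ in this repository.

```shell
# Hedged sketch of cmd.sh for a local machine; variable names are
# Kaldi-style assumptions, check the shipped cmd.sh for the actual ones.
export train_cmd="run.pl"
export infer_cmd="run.pl"
# Grid Engine example:
# export train_cmd="queue.pl --mem 16G"
# SLURM example:
# export train_cmd="slurm.pl"
```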
```
cd egs/mini_librispeech/v1
./run_prepare_shared.sh
./run.sh
```
If you want to run the EDA-EEND experiment, modify run.sh to use config/eda/{train,infer}.yaml.
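For instance, the change might amount to pointing the config variables at the EDA files; the actual variable names inside run.sh may differ, so this is only a sketch.

```shell
# Hypothetical sketch: inside run.sh, select the EDA configs.
# The real variable names in run.sh may differ.
train_config=config/eda/train.yaml
infer_config=config/eda/infer.yaml
```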
See RESULT.md and compare with your result.

Modify egs/callhome/v1/cmd.sh according to your job scheduler.
If you use your local machine, use "run.pl".
If you use Grid Engine, use "queue.pl".
If you use SLURM, use "slurm.pl".
For more information about cmd.sh, see http://kaldi-asr.org/doc/queue.html.

Modify egs/callhome/v1/run_prepare_shared.sh according to the storage paths of your corpora.

```
cd egs/callhome/v1
./run_prepare_shared.sh
# If you want to conduct 1-4 speaker experiments, run below.
# You also have to set paths to your corpora properly.
./run_prepare_shared_eda.sh
```

Train and evaluate the self-attention-based model (default):
```
./run.sh
```

For the BLSTM-based model:
```
local/run_blstm.sh
```

For the EDA-EEND model (variable number of speakers):
```
./run_eda.sh
```
[1] Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Kenji Nagamatsu, Shinji Watanabe, "End-to-End Neural Speaker Diarization with Permutation-free Objectives," Proc. Interspeech, pp. 4300-4304, 2019.
[2] Yusuke Fujita, Naoyuki Kanda, Shota Horiguchi, Yawen Xue, Kenji Nagamatsu, Shinji Watanabe, "End-to-End Neural Speaker Diarization with Self-attention," Proc. ASRU, pp. 296-303, 2019.
[3] Shota Horiguchi, Yusuke Fujita, Shinji Watanabe, Yawen Xue, Kenji Nagamatsu, "End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors," Proc. Interspeech, 2020.
@inproceedings{Fujita2019Interspeech,
  author={Yusuke Fujita and Naoyuki Kanda and Shota Horiguchi and Kenji Nagamatsu and Shinji Watanabe},
  title={{End-to-End Neural Speaker Diarization with Permutation-free Objectives}},
  booktitle={Interspeech},
  pages={4300--4304},
  year=2019
}