Problem running “pyannote-audio emb train”

Describe the bug

I can't get the embedding step to run normally. Running pyannote-audio emb train fails with the following log:

/root/anaconda3/envs/pyannote_2/lib/python3.8/site-packages/pyannote/audio/embedding/approaches/arcface_loss.py:170: FutureWarning: The 's' parameter is deprecated in favor of 'scale', and will be removed in a future release
  warnings.warn(msg, FutureWarning)
Loading labels: 0file [00:01, ?file/s]
Traceback (most recent call last):
  File "/root/anaconda3/envs/pyannote_2/bin/pyannote-audio", line 8, in <module>
    sys.exit(main())
  File "/root/anaconda3/envs/pyannote_2/lib/python3.8/site-packages/pyannote/audio/applications/pyannote_audio.py", line 366, in main
    app.train(protocol, **params)
  File "/root/anaconda3/envs/pyannote_2/lib/python3.8/site-packages/pyannote/audio/applications/base.py", line 205, in train
    batch_generator = self.task_.get_batch_generator(
  File "/root/anaconda3/envs/pyannote_2/lib/python3.8/site-packages/pyannote/audio/embedding/approaches/base.py", line 111, in get_batch_generator
    return SpeechSegmentGenerator(
  File "/root/anaconda3/envs/pyannote_2/lib/python3.8/site-packages/pyannote/audio/embedding/generators.py", line 99, in __init__
    total_duration = self._load_metadata(protocol, subset=subset)
  File "/root/anaconda3/envs/pyannote_2/lib/python3.8/site-packages/pyannote/audio/embedding/generators.py", line 148, in _load_metadata
    support = Segment(start=0, end=current_file["duration"])
  File "/root/anaconda3/envs/pyannote_2/lib/python3.8/site-packages/pyannote/database/protocol/protocol.py", line 122, in __getitem__
    value = self.lazy[key](self)
  File "/root/anaconda3/envs/pyannote_2/lib/python3.8/site-packages/pyannote/audio/features/utils.py", line 56, in get_audio_duration
    with SoundFile(current_file["audio"], "r") as f:
  File "/root/anaconda3/envs/pyannote_2/lib/python3.8/site-packages/pyannote/database/protocol/protocol.py", line 122, in __getitem__
    value = self.lazy[key](self)
  File "/root/anaconda3/envs/pyannote_2/lib/python3.8/site-packages/pyannote/database/util.py", line 120, in __call__
    path_templates = self.config_[database]
KeyError: 'VoxCeleb'
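
The last frame of the traceback is a plain dictionary lookup: the file's database name ('VoxCeleb') is used as a key into the mapping loaded from the configuration file, and that key is missing. A minimal sketch of the lookup, assuming the mapping is an ordinary dict from database name to path templates. The helper `templates_for` and the sample mapping are hypothetical and are not pyannote.database code:

```python
from typing import Dict, List, Union

Templates = Union[str, List[str]]


def templates_for(config: Dict[str, Templates], database: str) -> List[str]:
    """Return the path templates registered for `database`.

    Hypothetical helper: it mirrors the `self.config_[database]` lookup seen
    in the traceback, but lists the known names when the lookup fails.
    """
    try:
        templates = config[database]
    except KeyError:
        known = ", ".join(sorted(config)) or "<none>"
        raise KeyError(
            f"no entry for {database!r} under 'Databases' (known: {known})"
        ) from None
    # This helper normalizes a single template string to a one-element list.
    return [templates] if isinstance(templates, str) else list(templates)


# A mapping without a VoxCeleb entry reproduces the error from the log.
try:
    templates_for({}, "VoxCeleb")
except KeyError as error:
    print("lookup failed:", error)
```

If the mapping that is actually loaded turns out to be empty or to lack the VoxCeleb key, the problem lies in which file is read or how it is parsed, not in the audio paths themselves.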

I couldn't find ~/.pyannote/database.yml, so I created it with vim ~/.pyannote/database.yml and added this configuration:

Databases:

      VoxCeleb:
        - /share/kongtianlong/VoxCeleb1/dev/wav/{uri}.wav
        - /share/kongtianlong/VoxCeleb1/test/wav/{uri}.wav
        - /share/kongtianlong/VoxCeleb2/dev/aac/{uri}.wav
        - /share/kongtianlong/VoxCeleb2/test/aac/{uri}.wav
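
Each entry above is a template in which {uri} stands for a file's identifier. A quick way to check whether any template points at a real file is to expand the templates by hand. This is only a sketch: it assumes plain `str.format` substitution of `{uri}`, and the uri used below is made up for illustration, so replace it with one taken from the protocol. The helper name `expand_templates` is hypothetical.

```python
from pathlib import Path
from typing import Iterable, List, Tuple


def expand_templates(templates: Iterable[str], uri: str) -> List[Tuple[Path, bool]]:
    """Expand each `{uri}` template and report whether the file exists."""
    results = []
    for template in templates:
        path = Path(template.format(uri=uri))
        results.append((path, path.is_file()))
    return results


templates = [
    "/share/kongtianlong/VoxCeleb1/dev/wav/{uri}.wav",
    "/share/kongtianlong/VoxCeleb1/test/wav/{uri}.wav",
    "/share/kongtianlong/VoxCeleb2/dev/aac/{uri}.wav",
    "/share/kongtianlong/VoxCeleb2/test/aac/{uri}.wav",
]

# Hypothetical uri, for illustration only.
for path, found in expand_templates(templates, "some_speaker/some_video/00001"):
    print("found  " if found else "missing", path)
```

If every line prints "missing" for a uri that is known to be in the dataset, the templates (directory layout or file extension) need adjusting.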

So I guess there is a problem with my VoxCeleb data configuration, but I don't know how to fix it. Can you help me? Thanks!
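
One thing worth ruling out is that a different configuration file is being read. The sketch below lists candidate locations and whether each exists. It assumes (this is not confirmed by the log) that a database.yml in the current working directory and a path given by the PYANNOTE_DATABASE_CONFIG environment variable may be considered alongside ~/.pyannote/database.yml. The order in which they are listed is arbitrary and does not claim to match the library's precedence, and `candidate_configs` is a hypothetical helper.

```python
import os
from pathlib import Path
from typing import Dict, Mapping, Optional


def candidate_configs(
    env: Optional[Mapping[str, str]] = None,
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
) -> Dict[str, bool]:
    """Map each candidate database.yml location to whether it exists."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else cwd
    home = Path.home() if home is None else home

    candidates = [cwd / "database.yml", home / ".pyannote" / "database.yml"]
    # Assumed override variable; only listed when it is actually set.
    override = env.get("PYANNOTE_DATABASE_CONFIG")
    if override:
        candidates.append(Path(override).expanduser())
    return {str(path): path.is_file() for path in candidates}


for path, exists in candidate_configs().items():
    print("exists " if exists else "missing", path)
```

Run it from the same directory and shell environment as the training command, since both the working directory and the environment variable can differ between sessions.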

To Reproduce

Steps to reproduce the behavior:

$ pyannote-audio emb train --subset=train --to=250 --parallel=8 ${EXP_DIR} VoxCeleb.SpeakerVerification.VoxCeleb2

Content of config.yml

feature_extraction:
   name: pyannote.audio.features.RawAudio
   params:
      sample_rate: 16000

data_augmentation:
   name: pyannote.audio.augmentation.noise.AddNoise
   params:
     snr_min: 5
     snr_max: 15
     collection:
       - MUSAN.Collection.BackgroundNoise
       - MUSAN.Collection.Music

architecture:
   name: pyannote.audio.models.SincTDNN
   params:
      sincnet:
         stride: [5, 1, 1]
         waveform_normalize: True
         instance_normalize: True
      tdnn:
         embedding_dim: 512
      embedding:
         batch_normalize: False
         unit_normalize: False

task:
   name: AdditiveAngularMarginLoss
   params:
      margin: 0.05
      s: 10
      duration: 2.0
      per_fold: 256
      per_label: 1
      per_epoch: 5
      per_turn: 1
      label_min_duration: 30

scheduler:
   name: ConstantScheduler
   params:
      learning_rate: 0.01

pyannote environment

$ pip freeze | grep pyannote
pyannote.audio==1.1.1
pyannote.core==4.1
pyannote.database==4.1
pyannote.db.voxceleb==1.2
pyannote.metrics==3.0.1
pyannote.pipeline==1.5.2

pyannote/pyannote-audio issue #657