pyannote / pyannote-audio

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
http://pyannote.github.io
MIT License

Fix validation preparation issue when a protocol does not define a development set #1717

Closed clement-pages closed 6 months ago

clement-pages commented 6 months ago

When using the following protocol, which defines no development subset:

EmbeddingsTaskOnlyMini:
    train:
        VoxCeleb.SpeakerVerification.VoxCelebMini: [train, ]
    development:
        # no development subset defined

I encountered this error:

Traceback (most recent call last):
  File "/gpfsdswork/projects/rech/bvr/uaf83xi/eend_2/joint/test_train.py", line 40, in <module>
    trainer.fit(model)
  File "/gpfsdswork/projects/rech/bvr/uaf83xi/micromamba/envs/pyannote/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 544, in fit
    call._call_and_handle_interrupt(
  File "/gpfsdswork/projects/rech/bvr/uaf83xi/micromamba/envs/pyannote/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 44, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/gpfsdswork/projects/rech/bvr/uaf83xi/micromamba/envs/pyannote/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 580, in _fit_impl
    self._run(model, ckpt_path=ckpt_path)
  File "/gpfsdswork/projects/rech/bvr/uaf83xi/micromamba/envs/pyannote/lib/python3.9/site-packages/pytorch_lightning/trainer/trainer.py", line 941, in _run
    self._data_connector.prepare_data()
  File "/gpfsdswork/projects/rech/bvr/uaf83xi/micromamba/envs/pyannote/lib/python3.9/site-packages/pytorch_lightning/trainer/connectors/data_connector.py", line 100, in prepare_data
    call._call_lightning_module_hook(trainer, "prepare_data")
  File "/gpfsdswork/projects/rech/bvr/uaf83xi/micromamba/envs/pyannote/lib/python3.9/site-packages/pytorch_lightning/trainer/call.py", line 157, in _call_lightning_module_hook
    output = fn(*args, **kwargs)
  File "/gpfsdswork/projects/rech/bvr/uaf83xi/pyannote-audio/pyannote/audio/core/model.py", line 198, in prepare_data
    self.task.prepare_data()
  File "/gpfsdswork/projects/rech/bvr/uaf83xi/pyannote-audio/pyannote/audio/core/task.py", line 596, in prepare_data
    self.prepare_validation(prepared_data)
  File "/gpfsdswork/projects/rech/bvr/uaf83xi/pyannote-audio/pyannote/audio/tasks/segmentation/mixins.py", line 293, in prepare_validation
    get_dtype(max(v[0] for v in validation_chunks)),
ValueError: max() arg is an empty sequence

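This PR fixes the issue, so that validation preparation no longer fails when the protocol does not define a development set.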
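For readers unfamiliar with the failure mode, here is a minimal sketch of why `max()` raises in the traceback above and one possible way to guard against it. It is not the actual patch from this PR: the function names `unguarded_max_file_id` and `max_file_id`, and the `(file_id, start_time, duration)` chunk layout, are assumptions made for illustration only.

```python
from typing import Optional, Sequence, Tuple

# Assumed chunk layout for illustration: (file_id, start_time, duration).
Chunk = Tuple[int, float, float]


def unguarded_max_file_id(validation_chunks: Sequence[Chunk]) -> int:
    # Mirrors the expression in the traceback: max() over a generator
    # raises ValueError when validation_chunks is empty.
    return max(v[0] for v in validation_chunks)


def max_file_id(validation_chunks: Sequence[Chunk]) -> Optional[int]:
    # Hypothetical guard: return None when there is nothing to validate on,
    # so the caller can skip validation preparation altogether.
    if not validation_chunks:
        return None
    return max(v[0] for v in validation_chunks)


if __name__ == "__main__":
    try:
        unguarded_max_file_id([])
    except ValueError as error:
        print(f"unguarded: {error}")

    print(f"guarded, empty: {max_file_id([])}")
    print(f"guarded, two chunks: {max_file_id([(0, 0.0, 5.0), (3, 1.0, 5.0)])}")
```

An alternative with the same effect is `max(..., default=None)`, which avoids the explicit emptiness check; either way, the caller still has to decide what to do when no development set is available.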