Bug: Voice activity detection import fails

900miles commented 3 weeks ago

Description

When running the voice activity detection tutorial on Google Colab (possibly other environments as well, but I've only tested on Colab so far), I get a ValidationError when importing the voice activity detection module.

Specifically, the offending line is:

from senselab.audio.tasks.voice_activity_detection import detect_human_voice_activity_in_audios

Steps to Reproduce

Open the voice activity detection tutorial in Google Colab. Add a !pip install senselab line at the top and run it.

Expected Results

The tutorial should install senselab and import the modules without error.

Actual Results

The imports code block fails with:

ValidationError                           Traceback (most recent call last)

[<ipython-input-2-149eefe7e14a>](https://localhost:8080/#) in <cell line: 5>()
      3 
      4 from senselab.audio.data_structures.audio import Audio
----> 5 from senselab.audio.tasks.voice_activity_detection import detect_human_voice_activity_in_audios
      6 from senselab.utils.data_structures.model import PyannoteAudioModel
      7 from senselab.utils.data_structures.device import DeviceType

5 frames

[/usr/local/lib/python3.10/dist-packages/senselab/audio/tasks/voice_activity_detection/__init__.py](https://localhost:8080/#) in <module>
      1 """.. include:: ./doc.md"""  # noqa: D415
      2 
----> 3 from .api import detect_human_voice_activity_in_audios  # noqa: F401

[/usr/local/lib/python3.10/dist-packages/senselab/audio/tasks/voice_activity_detection/api.py](https://localhost:8080/#) in <module>
      6 
      7 from senselab.audio.data_structures.audio import Audio
----> 8 from senselab.audio.tasks.speaker_diarization.pyannote import diarize_audios_with_pyannote
      9 from senselab.utils.data_structures.device import DeviceType
     10 from senselab.utils.data_structures.model import PyannoteAudioModel, SenselabModel

[/usr/local/lib/python3.10/dist-packages/senselab/audio/tasks/speaker_diarization/__init__.py](https://localhost:8080/#) in <module>
      1 """.. include:: ./doc.md"""  # noqa: D415
      2 
----> 3 from .api import diarize_audios  # noqa: F401

[/usr/local/lib/python3.10/dist-packages/senselab/audio/tasks/speaker_diarization/api.py](https://localhost:8080/#) in <module>
      9 
     10 from senselab.audio.data_structures.audio import Audio
---> 11 from senselab.audio.tasks.speaker_diarization.pyannote import diarize_audios_with_pyannote
     12 from senselab.utils.data_structures.device import DeviceType
     13 from senselab.utils.data_structures.model import PyannoteAudioModel, SenselabModel

[/usr/local/lib/python3.10/dist-packages/senselab/audio/tasks/speaker_diarization/pyannote.py](https://localhost:8080/#) in <module>
     49 def diarize_audios_with_pyannote(
     50     audios: List[Audio],
---> 51     model: PyannoteAudioModel = PyannoteAudioModel(path_or_uri="pyannote/speaker-diarization-3.1", revision="main"),
     52     device: Optional[DeviceType] = None,
     53     num_speakers: Optional[int] = None,

[/usr/local/lib/python3.10/dist-packages/pydantic/main.py](https://localhost:8080/#) in __init__(self, **data)
    191         # `__tracebackhide__` tells pytest and some other tools to omit this function from tracebacks
    192         __tracebackhide__ = True
--> 193         self.__pydantic_validator__.validate_python(data, self_instance=self)
    194 
    195     # The following line sets a flag that we use to determine when `__init__` gets overridden by the user

ValidationError: 1 validation error for PyannoteAudioModel
revision
  Value error, path_or_uri (pyannote/speaker-diarization-3.1) or specified revision (main) is not a valid Hugging Face model [type=value_error, input_value='main', input_type=str]
    For further information visit https://errors.pydantic.dev/2.8/v/value_error

Additional Notes

No response

fabiocat93 commented 3 weeks ago

@900miles nice catch! I think the issue here is that some Hugging Face models (e.g., pyannote/speaker-diarization-3.1) require the user to accept some conditions before they can be used. A potential solution might be to add something like this to the tutorial:

Before you start testing with our default pyannote-audio model, we ask you to:

1. accept their user conditions on hf.co/pyannote/speaker-diarization-3.1
2. accept their user conditions on hf.co/pyannote/segmentation-3.0
3. log in using notebook_login below

from huggingface_hub import notebook_login
notebook_login()

900miles commented 3 weeks ago

That makes sense, and I think it's a good solution. However, we should raise that error only when the user attempts to run the model, not during import. At import time no model has been specified, so the error could be confusing, especially if the user wants to use a model other than pyannote.
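
For what it's worth, the traceback suggests the error fires at import because the default PyannoteAudioModel(...) sits in the function signature, and Python evaluates default arguments once, when the def statement runs. A minimal sketch of one possible fix is to default to None and build the model inside the function. GatedModel and diarize_sketch below are hypothetical stand-ins, not senselab code, and the authorized flag only simulates the Hub check:

```python
from typing import List, Optional


class GatedModel:
    """Hypothetical stand-in for PyannoteAudioModel: validates on construction."""

    def __init__(self, path_or_uri: str, revision: str = "main", authorized: bool = False) -> None:
        # Simulates a Hub check that fails for users who lack access to the model.
        if not authorized:
            raise ValueError(f"{path_or_uri} ({revision}) is not a valid Hugging Face model")
        self.path_or_uri = path_or_uri
        self.revision = revision


def diarize_sketch(audios: List[str], model: Optional[GatedModel] = None) -> str:
    """Hypothetical function: the default model is built at call time, not at import."""
    if model is None:
        # Validation (and any error) happens only when the default is actually needed.
        model = GatedModel(path_or_uri="pyannote/speaker-diarization-3.1", revision="main")
    return f"would diarize {len(audios)} audio(s) with {model.path_or_uri}"
```

With this shape, importing the module never constructs the default model, and a user who passes their own model never triggers the pyannote check at all.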

fabiocat93 commented 3 weeks ago

> That makes sense and I think is a good solution. However we should raise that error only when the user attempts to run the model, not during import. During import the model isn't specified, so having this error could be confusing. Especially if the user wants to use a model other than pyannote

I definitely agree in principle. I guess the error is raised because the code tries to connect to the Hub to check whether the model exists, and since the user is not authorized to use it, they aren't even allowed to know that it exists. Any ideas on how to solve this?
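
One idea, if the guess above is right: have the check tell "access denied" apart from "not found", so the message can point the user at the conditions page and the login step. This is only a rough sketch. AccessDeniedError, NotFoundError, explain_model_check, and the injected lookup callable are all hypothetical names. huggingface_hub does expose GatedRepoError and RepositoryNotFoundError, which might map onto these stand-ins, but I haven't verified which one the Hub raises in this case:

```python
from typing import Callable


class AccessDeniedError(Exception):
    """Hypothetical: the Hub refused access (e.g. conditions not accepted)."""


class NotFoundError(Exception):
    """Hypothetical: the repository or revision does not exist."""


def explain_model_check(path_or_uri: str, revision: str, lookup: Callable[[str, str], None]) -> str:
    """Run a Hub lookup and return a message tailored to the failure mode."""
    try:
        lookup(path_or_uri, revision)
    except AccessDeniedError:
        return (
            f"{path_or_uri} appears to be gated: accept its user conditions "
            "on the Hub and log in (e.g. with notebook_login) before running it."
        )
    except NotFoundError:
        return f"{path_or_uri} (revision {revision}) was not found on the Hub."
    return "ok"
```

Combined with running the check only when the model is actually used, the user would see an actionable message at call time instead of a generic "not a valid Hugging Face model" at import.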
